Abstract

Background: Contemporary coral reef research has firmly established that a genomic approach is urgently needed 
to better understand the effects of anthropogenic environmental stress and global climate change on coral 
holobiont interactions. Here we present KEGG orthology-based annotation of the complete genome sequence of 
the scleractinian coral Acropora digitifera and provide the first comprehensive view of the genome of a 
reef-building coral by applying advanced bioinformatics. 

Description: Sequences from the KEGG database of protein function were used to construct hidden Markov 
models. These models were used to search the predicted proteome of A. digitifera to establish complete genomic 
annotation. The annotated dataset is published in ZoophyteBase, an open access format with different options for 
searching the data. A particularly useful feature is the ability to use a Google-like search engine that links query 
words to protein attributes. We present features of the annotation that underpin the molecular structure of key 
processes of coral physiology that include (1) regulatory proteins of symbiosis, (2) planula and early developmental 
proteins, (3) neural messengers, receptors and sensory proteins, (4) calcification and Ca2+-signalling proteins,
(5) plant-derived proteins, (6) proteins of nitrogen metabolism, (7) DNA repair proteins, (8) stress response proteins,
(9) antioxidant and redox-protective proteins, (10) proteins of cellular apoptosis, (11) microbial symbioses and
pathogenicity proteins, (12) proteins of viral pathogenicity, (13) toxins and venom, (14) proteins of the chemical 
defensome and (15) coral epigenetics. 

Conclusions: We advocate that providing annotation in an open-access searchable database available to the public 
domain will give an unprecedented foundation to interrogate the fundamental molecular structure and interactions 
of coral symbiosis and allow critical questions to be addressed at the genomic level based on combined aspects of 
evolutionary, developmental, metabolic, and environmental perspectives. 
Background

All of the reef-building corals (Scleractinia; phylum 
Cnidaria) that create the vast calcium carbonate de- 
posits of coral reefs have evolved an endosymbiotic 
partnership with photosynthetic dinoflagellates of the 
genus Symbiodinium (Dinophyceae), commonly known 
as zooxanthellae, which reside within the gastrodermal 
cells of their scleractinian host [1-3]. Coral-algal symbiosis 
is a cooperative metabolic adaptation necessary for sur- 
vival in the shallow oligotrophic (nutrient-poor) waters of 
tropical and subtropical marine environments [4,5] that 
drives the productivity of coral reefs [6]. Coral reefs pro- 
vide habitat and trophic support for many thousands of 
marine species, the richness of which rivals the
biodiversity of tropical rainforests [7]. Underlying the basic
requirements of corals for growth, reproduction and survival
are special needs to accommodate symbiont-specific
host recognition and to control innate and responsive immune
systems. Future research is likely to reveal the extent to
which the host is involved in direct regulation of its
endosymbiont populations.
Much is understood about the cellular biology of 
cnidarian-dinoflagellate symbiosis (reviewed in [8]), 
but less is known at the molecular level of coral symbiology. 
There is little opposition to the contention that envi- 
ronmental and anthropogenic disturbances are causing 
alarming losses to coral reefs ([9] and references therein).
Threats to productivity are being imposed by the disruption 
of coral symbiosis (apparent as "coral bleaching") caused in 
response to increasing thermal stress attributed to global 
warming [10,11], from an increase in stress-related coral 
disease [12-14], from the discharge of domestic and indus- 
trial wastes, pollutants from agricultural development and 
the transport of sediments in terrestrial runoff [15,16], and 
potentially from imminent declines in coral calcification 
owing to rising ocean acidification [17-19]. Accordingly, we 
require a better understanding of the molecular stress re- 
sponses and adaptive potential of corals. Such information 
is necessary to predict bleaching events and so better in- 
form effective management policies for the conservation of 
coral reef ecosystems [20-24].

To understand how coral holobionts respond to envi- 
ronmental change at the molecular level, the identification 
of genes that may respond by transcription to stress is of 
primary importance [25]. Thus, the use of transcriptomic 
methodologies to identify stress-responsive genes has been 
highly successful [26-32]. Transcriptome high-throughput 
profiling has allowed changes in gene expression across 
thousands of genes to be measured simultaneously. Fuelled
by data-generating power, the number of coral-based
studies utilising transcriptomics to investigate molecular 
responses to environmental stressors has expanded greatly 
by the acquisition of expressed sequence tag (EST) gene libraries,
the fabrication of microarray biochips used to
estimate levels of mRNA expression, and by direct analysis
using next-generation, high-throughput sequencing. How- 
ever, much of this work has been conducted using the 
aposymbiotic state of pre-settlement coral larvae, so 
transcribed genes relevant to metamorphosis and the 
cytobiology of the adult polyp are limited to a few recent 
studies [33-36]. The transcriptome additionally does not 
provide the structural framework and essential regulatory 
elements of the functional genome for comprehensive 
evaluation. Recently, deep metatranscriptomic sequencing 
of two adult coral holobiomes has been made available 
on searchable databases: PocilloporaBase for Pocillopora
damicornis [36] and PcarnBase for Platygyra carnosus
[37]. In contrast, high-throughput metaproteomic analyses
to quantify the product yield of stress-response genes of 
the coral holobiome are yet to be widely adopted by the 
coral reef scientific community, despite the proteome be- 
ing the ultimate measure of the coral phenotype [38,39]. 

The early accumulation of transcriptomic data revealed 
that a small proportion of coral ESTs matched genes 
known previously only from other kingdoms of life, imply- 
ing that the ancestral animal genome contained many 
genes traditionally regarded as 'non-animal' that have been 
lost from most animal genomes [40]. Furthermore, an unexpected
revelation from EST data is the greater extent to
which coral sequences resemble human genes than those 
of the Drosophila and Caenorhabditis model invertebrate 
genomes [41,42]. Comparative genomic analysis has 
revealed higher genetic divergence and massive gene loss 
within the ecdysozoan lineages. Hence, many genes 
assumed to have much later evolutionary origins are likely 
to have been present in an ancestral or early-diverged 
metazoan [43]. While much of the animal kingdom 
remains yet to be explored, examples of the metazoan 
phylum Cnidaria provide a unique insight into the deep 
evolutionary origins of at least some vertebrate gene fa- 
milies [42]. Thus, the complete genomic sequence of a 
coral is likely to reveal many genes previously assumed to 
be strictly vertebrate innovations. To date, cnidarian genomes
have been published for the sea anemone Nematostella
vectensis [42] and the hydroid Hydra magnipapillata [44].
Only the coral genome of Acropora digitifera is available
without restriction on use of its published sequence [45],
but the compiled sequence has not been fully annotated.
At the time of this writing, the genome assembly of
Acropora millepora has been released to the public domain
[46], also without full annotation, but an embargo is
imposed on use of this data that is highly restrictive to the 
progress of further studies. Understanding how genomic 
variation affects molecular and organismal biology is the 
ultimate justification of genome sequencing, and annota- 
tion is an essential step in this process. We envisage that 
unrestricted access to annotation of the A. digitifera gen- 
ome will provide an unprecedented foundation to freely 
interrogate the generic molecular structure, possible
endobiotic interactions and the response of coral to en- 
vironmental stress. Accordingly, we offer annotation of 
the predicted proteome of A. digitifera on the open ac- 
cess and searchable database, ZoophyteBase [47]. Use 
of the ZoophyteBase search engines will allow genes of 
encoded proteins to be identified that can be examined in 
context of the cellular physiology, processes of ecological 
significance, the evolutionary and developmental biology 
of corals and the functional metabolism of the holobiont 
that collectively underpin the health of coral reefs. 

Construction and content 

ZoophyteBase is an open access and searchable database 
of complete annotation of the predicted proteome of the 
coral A. digitifera [48]. It was constructed using the
MEGGASENSE system, which is a general system for constructing
annotation databases with different sorts of input
data (DNA reads, assembled genomes, predicted
proteomes) and the possibility of using different combi- 
nations of analysis tools to create the annotation (Gacesa 
et al, in preparation). In the case of ZoophyteBase, 
hidden Markov model (HMM) profiles [49] were chosen 
as the annotation tool rather than the more common 
BLAST searches [50]. HMM profiles are constructed 
from multiple alignments of protein families and contain 
information about conserved differences in amino acid 
residues as well as deletions and insertions [49]. This is 
particularly important for a coral database, as corals are 
evolutionarily distant to most other organisms. This 
means that known homologous sequences present in the 
databases will usually have relatively low similarity, mak- 
ing BLAST searches inaccurate. The statistical informa- 
tion in an HMM profile gives more sensitive and 
accurate detection of sequence homology. An additional 
advantage of HMM profiles is that the statistical signifi- 
cance of hits (the expected value) is much more accurate 
than that calculated by BLAST programs. 

The quality of sequence annotation is limited by the 
accuracy of information provided in any database used. 
It is well known that there are many problems with 
annotation in the large uncurated databases such as the 
NCBI GenBank nr sequences. It is widely accepted that the most
accurate database for functional annotation is the KEGG
database [51]. The KEGG database organises sequences
as groups of KEGG orthologues. These are sets of hom- 
ologous sequences from as wide a range of organisms as 
possible having an assigned molecular function. These 
functions are arranged in a hierarchical fashion and 
grouped in biological pathways. The sequences belong- 
ing to KEGG orthologues were used to construct HMM 
profiles for annotating the coral sequences. Accordingly, 
the 23,524 predicted proteins encoded in the coral genome
were analysed using HMM profiles. If a protein
showed a highly significant correlation ("hit") to a single
HMM profile, this was used to create a "trusted" annota- 
tion of the sequence. Choosing a cut-off for this criterion 
is not trivial, because longer sequences tend to have more 
significant e-values. For construction of ZoophyteBase an
e-value criterion of 1e-5 was used. This resulted in 19,044 predicted
proteins giving "trusted" sequence annotation. For many 
of these proteins there were two or more highly significant 
hits to established HMM profiles. In these cases, the most 
significant correlation was used to construct our "best-fit" 
annotation file, but other hits can be viewed by the data- 
base user so that expert knowledge can be employed to 
override the automatic annotation function. In 8,004 out 
of 19,044 predicted proteins which were annotated, more 
than one annotation was assigned based on non- 
overlapping regions within the protein which were used to 
construct the "best-fit" annotation file. We interpreted 
these as "fusion" events generated by the in silico protein 
prediction method used, and these proteins were treated 
as multiple instead of single encoded proteins. Hence, this 
analysis resulted in the annotation of 33,195 proteins in 
total, generated from the original 23,524 predicted coral 
proteins. This is a very conservative annotation scheme, so 
it can be assumed that most of the annotations are bio- 
logically meaningful. Almost 81% (19,044 out of 23,524) of 
the predicted proteome was assigned using this method. 
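
The selection rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MEGGASENSE implementation: the `Hit` record, its field names, the placeholder identifiers, the greedy order of acceptance and the inclusive comparison against the cut-off are all hypothetical. Only the 1e-5 criterion, the preference for the most significant hit and the treatment of non-overlapping regions as separate annotations are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List

E_VALUE_CUTOFF = 1e-5  # criterion stated in the text


@dataclass(frozen=True)
class Hit:
    """One HMM profile match on a predicted protein (hypothetical layout)."""
    protein: str   # identifier of the predicted protein
    profile: str   # identifier of the matching HMM profile
    start: int     # first matched residue, 1-based inclusive
    end: int       # last matched residue, inclusive
    evalue: float  # expected value reported for the match


def overlaps(a: Hit, b: Hit) -> bool:
    """Return True when two hits share at least one residue."""
    return a.start <= b.end and b.start <= a.end


def best_fit(hits: List[Hit]) -> Dict[str, List[Hit]]:
    """Keep significant hits, then accept them from most to least
    significant, skipping any hit that overlaps one already accepted
    for the same protein (assumed greedy strategy)."""
    accepted: Dict[str, List[Hit]] = {}
    significant = [h for h in hits if h.evalue <= E_VALUE_CUTOFF]
    for hit in sorted(significant, key=lambda h: h.evalue):
        chosen = accepted.setdefault(hit.protein, [])
        if not any(overlaps(hit, other) for other in chosen):
            chosen.append(hit)
    return accepted


# Invented example data, for illustration only.
example_hits = [
    Hit("protein_A", "profile_1", 1, 200, 1e-40),
    Hit("protein_A", "profile_2", 10, 190, 1e-12),   # overlaps profile_1
    Hit("protein_A", "profile_3", 250, 400, 1e-20),  # separate region
    Hit("protein_B", "profile_4", 1, 90, 1e-3),      # fails the criterion
]
annotation = best_fit(example_hits)
for protein, chosen in annotation.items():
    print(protein, [h.profile for h in chosen])
```

In this sketch protein_A keeps two annotations on separate regions and would therefore be counted twice, which mirrors how 19,044 annotated predicted proteins can give rise to 33,195 annotated proteins in total.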

Utility 

The MEGGASENSE system was used to generate a web 
interface for ZoophyteBase. The home page (Figure 1A) 
allows the use of several functions. A text version of the 
entire annotation can be downloaded for manual inspec- 
tion. There is a proteome overview that gives statistics 
about the database and a breakdown of the annotated 
functions into different categories of genes. A particu- 
larly useful feature of ZoophyteBase is the ability to use 
text queries employing a search engine that provides a 
relevant inquiry in the absence of an exact match be- 
tween key words of a search and those described for a 
functional protein. The search engine uses text from the 
KEGG-database, PubMed and other sources to establish 
links between query words to access protein data using 
an intelligent Google-like search engine implemented by 
the search platform Lucene/Solr [52]. This helps to over- 
come the common problem that different terminology is 
used by different groups of researchers. The use of this 
search function is illustrated by using the query "phagocytosis"
(Figure 1B). This inquiry finds 42 hits to
KEGG orthologue profiles. One of the hits corresponds 
to amphiphysin (a synaptic vesicle protein) with annota- 
tion of two protein homologues encoded in the coral 
genome. On the data page there is a brief description of 
the function of amphiphysin together with a PUBMED 
literature reference. The sequences of the predicted coral 
Figure 1 Graphical overview of the user-web interface for ZoophyteBase during a typical search. The home page allows several search
functions (A). A text query using an intelligent Google-like search engine is illustrated by using the query "phagocytosis" (B). This finds 42 hits to
KEGG orthologue profiles. One of the hits corresponds to amphiphysin with annotation of two protein homologues encoded in the coral 
genome. On the data page there is a brief description of the function of amphiphysin together with a PUBMED literature reference. The 
sequences of the predicted coral proteins can be retrieved (C). 



proteins (Figure 1C) can be retrieved, and it is also pos- 
sible to analyse such data with computer aided drug 
design methods [53] to look for conserved domains. 
There are also two tools for the user to examine matches 
to protein sequences. The user can carry out a BLAST 
search against the coral protein sequence or analyse the 
predicted sequence against HMM profiles used to annotate
the coral proteome. These tools require the user only
to paste their query into the sequence window.
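
The idea of linking query words to protein attributes through associated text, rather than through an exact match on the protein name, can be illustrated with a toy inverted index. This is only a sketch and makes no claim about how Lucene/Solr or ZoophyteBase are configured: the record texts, the `build_index` and `search` helpers and the all-terms-must-match rule are invented for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical records: orthologue name -> free text associated with it.
RECORDS: Dict[str, str] = {
    "amphiphysin": "synaptic vesicle protein linked in the literature to phagocytosis",
    "syntaxin 7": "SNARE protein involved in membrane fusion",
}


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map every word of a record's name and text to the record name."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, text in records.items():
        for token in tokenize(name + " " + text):
            index[token].add(name)
    return index


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return the records whose name or text contains every query word."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)


index = build_index(RECORDS)
print(search(index, "phagocytosis"))
```

Here the query word does not occur in the protein name at all, yet the record is found through its associated text, which is the behaviour the "phagocytosis" example in the text relies on.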

In this manuscript we demonstrate the utility of 
ZoophyteBase by presenting predicted gene-encoded 
proteins revealed by annotation of the A. digitifera gen- 
ome that have physiological, biological and environmental 
significance. We discuss features of importance in coral 
physiology: (1) regulatory proteins of symbiosis, (2) planula 
and early developmental proteins, (3) neural messengers, 
receptors and sensory proteins, (4) calcification and Ca2+-signalling
proteins, (5) plant-derived proteins, (6) proteins
of nitrogen metabolism, (7) DNA repair proteins, (8) stress
response proteins, (9) antioxidant and redox-protective
proteins, (10) proteins of cellular apoptosis, (11) microbial
symbioses and pathogenicity proteins, (12) proteins of viral
pathogenicity, (13) toxins and venom, (14) proteins of the
chemical defensome and (15) coral epigenetics.

Discussion 

Regulatory proteins of symbiosis 

Metabolic cooperation is a key feature of coral-algal 
symbiosis that allows reef-building corals to inhabit the 
often nutrient-poor waters of tropical oceans [54]. In 
this phototrophic symbiosis, fixed carbon produced by
resident algae is released to the host for nutrition, and 
the algal symbionts benefit by acquiring the inorganic 
nutrient wastes of host metabolism [2,55]. The symbiotic 
dinoflagellates reside and proliferate within a specialised
phagosome (the symbiosome) maintained within host 
gastrodermal cells. This arrangement requires complex
biochemical coordination by the coral at various metabolic
stages: endocytosis (phagocytosis) by post-settlement
polyps to acquire algal symbionts, symbiosome
recognition to arrest phagosomal maturation
for sustained organelle homeostasis, activation of symbiophagy
or exocytosis to eliminate damaged symbionts [56,57], and
regulation of apoptotic or exocytotic pathways to remove
excess or impaired populations, all of which have long
been recognised as essential to preserve the stability of
coral symbiosis [58]. Although these processes are poorly 
understood in corals, it has been realised from studies of 
the sea anemone Aiptasia pulchella, a related anthozoan 
also containing Symbiodinium sp. endosymbionts, that the 
persistence of algal-containing symbiosomes in Cnidaria 
relies on the exclusion or retention of small Rab GTPase 
family proteins that are key regulatory components of 
vesicular trafficking and membrane fusion in eukaryotic 
cells [59]. Significantly, ApRab3 and ApRab4 accumulate 
in the biogenesis of maturing symbiosomes of A. pulchella 
[60,61], and mature symbiosomes enveloping healthy di- 
noflagellates have tethered ApRab5 [62], a checkpoint 
antagonist of downstream ApRab7 and ApRab11 proteins
that would otherwise direct autophagy of the symbiont 
cargo [63,64]. 

Our annotation of the A. digitifera genome reveals 
sequences encoding putative Rab homologues of the Ras 
superfamily of proteins (Table 1). In a comparison of 
cnidarian Rab proteins, eight proteins of A. digitifera 
matched homologues of Aiptasia pulchella, twenty-nine 
matched proteins encoded by the aposymbiotic freshwater 
H. magnipapillata and the aposymbiotic anemone N. 
vectensis genomes, while seven Rab and Rab-interacting 
proteins of A. digitifera did not match other cnidarian pro- 
teins (Table 2). Significantly, the eight homologues of A. 
digitifera that matched exclusively Rab proteins of A. 
pulchella included homologues of the aforementioned 
ApRab3, ApRab4 and ApRab5 proteins attributed to the 
maintenance of healthy symbiosomes in Aiptasia, while 
homologues of the autophagic ApRab7 and ApRab11 proteins
are found also in N. vectensis. While Rab GTPase
proteins and their effector proteins coordinate conse- 
cutive stages of endocytic vesicular transport [65,66], 
soluble N-ethylmaleimide-sensitive factor attachment re- 
ceptor (SNARE) proteins are essential for Rab assembly to 
complete endosomal fusion of vesicle membranes [67], a 
process by which Rab proteins impart specificity by bind- 
ing distinct Rab and SNARE partner proteins prior to 
membrane fusion [68]. Genes encoding syntaxin-like 
SNARE proteins have been unambiguously identified 
[69] from coral EST database libraries constructed from 
expressed mRNA isolated from various early life stages of
Acropora aspera, A. millepora, A. palmata and Orbicella
faveolata (= Montastraea faveolata), as well as from the
genome of the sea anemone N. vectensis [70]. In meta- 
zoans, vacuolar r-SNARE receptor proteins comprise the 
syntaxin, synaptobrevin and VAMP family proteins, of 
which there are eight syntaxin and syntaxin-binding proteins
(plus two plant-like syntaxins). Additionally, there
are one t-SNARE target protein to direct vacuolar morphogenesis,
two synaptosomal proteins, one synaptonemal
complex ZIP1 protein (yeast homologue), one synaptobrevin
membrane protein of secretory vesicles, ten
vesicle-associated membrane proteins (VAMPs), a vacuolar
protein-8 regulator of autophagy, four vacuolar-sorting
proteins and two SEC22 vesicle trafficking proteins
encoded in the genome of A. digitifera (Table 1), many 
of which may interact to provide metabolic transport 
between the endoplasmic reticulum and Golgi ap- 
paratus [71]. Included in this vast but yet unexplored 
repertoire of vacuolar-acting proteins are the syntaxin- 
binding amisyn and tomosyn regulators of SNARE com- 
plex assembly and disassembly [72,73], which may control 
membrane fusion in the phagocytic establishment and dis- 
sociation of coral symbiosis. 

In the final step of exocytosis there is a cytosolic influx 
of calcium which binds to synaptotagmin to actuate 
completion of membrane SNARE protein assembly with 
exocytic docking to form the conducting channel for 
trans-membrane vesicular transport on activation by 
vesicle-fusing ATPase [74]. As synaptotagmin proteins 
are not included in the KEGG database, ZoophyteBase
was used for BLAST searches with all known
synaptotagmin sequences [27]. Synaptotagmin proteins
from A. digitifera were found having similarity to
homologues from diverse invertebrate and vertebrate or- 
ganisms, including one from the human genome 
(Table 3). Other Ca2+-sensing proteins of A. digitifera,
such as calmodulin and the calcium binding protein
CML, are given with calcification and Ca2+-signalling
proteins.

Intriguingly, annotation of the A. digitifera genome re- 
veals a host cell factor (K14966), but this is not related 
to the elusive "host factor" of symbiosis demonstrated to 
be present in tissue homogenates of corals and other 
marine invertebrates that harbor Symbiodinium spp. endosymbionts
[75-77]. Instead, this mammalian transcriptional
coactivator host cell factor (HCF-1) is known to
mediate the enhancer-promoter assemblies of herpes 
simplex (HSV) and varicella zoster (VZV) viruses for ac- 
tivation of the latent state for replication [78], such that 
the coral HCF homologue may have similar relevance as 
a viral checkpoint transcriptional coactivator of viru- 
lence in A. digitifera. HCF-1 expression is coupled also 
to chromatin modification [79,80] suggesting that the 
coral protein homologue may have an additional role in 
Table 1 Regulatory proteins of symbiosis in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.06849 | K06110 | Exocyst complex component 3
v1.00063; v1.01826 | K06111 | Exocyst complex component 4
v1.06336; v1.06337; v1.15354 | K07195 | Exocyst complex component 7
v1.04340 [+ 4 other sequence copies] | K14966 | Host cell factor
v1.01629; v1.19166 | K12481 | Rabenosyn-5
v1.18447 [+ 26 other sequence copies] | K07976 | Rab family, other (similar to Rab-6B)
v1.02380 | K12480 | Rab GTPase-binding effector protein-1
v1.01032 | K13883 | Rab-interacting lysosomal protein
v1.14682; v1.03256; v1.07709 | K12484 | Rab11 family-interacting protein-1/2/5
v1.13055; v1.13176; v1.16348 | K12485 | Rab11 family-interacting protein-3/4
v1.01275 | K07932 | Rab-like protein-2B
v1.17629 [+ 13 other sequence copies] | K07933 | Rab-like protein-3
v1.03299; v1.09653 | K07934 | Rab-like protein-4
v1.08498 | K07935 | Rab-like protein-5
v1.16155 [+ 5 other sequence copies] | K07874 | Ras-related protein Rab-1A
v1.09098 | K07875 | Ras-related protein Rab-1B
v1.13558; v1.08983 | K07877 | Ras-related protein Rab-2A
v1.14260 | K07878 | Ras-related protein Rab-2B
v1.07500; v1.20532; v1.07498 | K07884 | Ras-related protein Rab-3D
v1.21242; v1.07502 | K07880 | Ras-related protein Rab-4B
v1.01341; v1.05619 | K07888 | Ras-related protein Rab-5B
v1.07125 | K07889 | Ras-related protein Rab-5C
v1.09239 | K07893 | Ras-related protein Rab-6A
v1.10443; v1.13335 | K07897 | Ras-related protein Rab-7A
v1.03086; v1.17122; v1.07231 | K07916 | Ras-related protein Rab-7L1
v1.02275 [+ 4 other sequence copies] | K07901 | Ras-related protein Rab-8A
v1.24612 | K07899 | Ras-related protein Rab-9A
v1.00411 | K07900 | Ras-related protein Rab-9B
v1.10697; v1.01515 | K07903 | Ras-related protein Rab-10
v1.22278; v1.04408; v1.12528 | K07905 | Ras-related protein Rab-11B
v1.07033; v1.23028 | K07881 | Ras-related protein Rab-14
v1.02275 | K07908 | Ras-related protein Rab-15
v1.16455; v1.14911; v1.14959 | K07910 | Ras-related protein Rab-18
v1.04714 | K07911 | Ras-related protein Rab-20
v1.01878; v1.12184 | K07890 | Ras-related protein Rab-21
v1.09930 | K06234 | Ras-related protein Rab-23
v1.13579; v1.12841 | K07912 | Ras-related protein Rab-24
v1.10183 | K07913 | Ras-related protein Rab-26
v1.08199 | K07885 | Ras-related protein Rab-27A
v1.13978; v1.18893 | K07917 | Ras-related protein Rab-30
v1.03085; v1.06007; v1.07729 | K07918 | Ras-related protein Rab-32
v1.24721 | K07919 | Ras-related protein Rab-33A
v1.18892 | K07920 | Ras-related protein Rab-33B
v1.16060 | K07876 | Ras-related protein Rab-35
v1.15894 | K07922 | Ras-related protein Rab-36
v1.03080 | K07923 | Ras-related protein Rab-38
v1.21391 | K07924 | Ras-related protein Rab-39A
v1.14786 | K07928 | Ras-related protein Rab-40
v1.05611 [+ 13 other sequence copies] | K08502 | Regulator of vacuolar morphogenesis (t-SNARE domain)
v1.18253 | K08520 | SEC22 vesicle trafficking protein A/C
v1.15499 | K13814 | t-SNARE domain-containing protein 1
v1.05749 | K08516 | Synaptobrevin homologue YKT6
v1.13229 | K12768 | Synaptonemal complex protein ZIP1
v1.16533; v1.17141 | K08508 | Synaptosomal-associated protein, 23 kDa
v1.05301 | K08509 | Synaptosomal-associated protein, 29 kDa
v1.19071 | K04560 | Syntaxin 1A
v1.04614; v1.22747 | K08486 | Syntaxin 1B/2/3
v1.16462 | K08490 | Syntaxin 5
v1.20758; v1.21534 | K08498 | Syntaxin 6
v1.22836; v1.15499 | K08488 | Syntaxin 7
v1.01959; v1.24227 | K08501 | Syntaxin 8
v1.02007; v1.06683; v1.12727 | K08491 | Syntaxin 17
v1.21308; v1.11830; v1.01582 | K08492 | Syntaxin 18
v1.22100; v1.09457 | K08518 | Syntaxin binding protein 5 (tomosyn)
v1.18555 | K08519 | Syntaxin binding protein 6 (amisyn)
v1.12938 | K08500 | Syntaxin of plants SYP6
v1.06575 | K08506 | Syntaxin of plants SYP7
v1.14699 | K08507 | Unconventional SNARE in the endoplasmic reticulum protein 1
v1.23782 [+ 38 other sequence copies] | K08332 | Vacuolar protein 8
v1.15282; v1.24603; v1.01672 | K12196 | Vacuolar protein-sorting-associated protein 4
v1.17791 [+ 4 other sequence copies] | K12479 | Vacuolar protein sorting-associated protein 45
v1.20907 | K11664 | Vacuolar protein sorting-associated protein 72
v1.15996 [+ 5 other sequence copies] | K12199 | Vacuolar protein sorting-associated protein VTA1
v1.15614 | K08510 | Vesicle-associated membrane protein 1 (synaptobrevin)
v1.13353 | K13504 | Vesicle-associated membrane protein 2 (synaptobrevin)
v1.12458; v1.07528 | K13505 | Vesicle-associated membrane protein 3 (cellubrevin)
v1.19735; v1.21831; v1.07186 | K08513 | Vesicle-associated membrane protein 4 (Golgi transport)
v1.05299 | K08514 | Vesicle-associated membrane protein 5 (exocytosis)
v1.13557; v1.24610 | K08515 | Vesicle-associated membrane protein 7 (exocytosis)
v1.12279 | K08512 | Vesicle-associated membrane protein 8 (endobrevin)
v1.00261; v1.08699; v1.04334 | K06096 | Vesicle-associated membrane protein A
v1.20177 | K10707 | Vesicle-associated membrane protein B
v1.15472; v1.03568 | K06027 | Vesicle-fusing ATPase
v1.11431; v1.10487 | K08517 | Vesicle transport protein SEC22
v1.06393; v1.13003; v1.08735; v1.04261 | K08493 | Vesicle transport interaction with t-SNAREs 1

Table 2 Distribution of Rab homologues of Aiptasia pulchella, Hydra magnipapillata and Nematostella vectensis in the predicted proteome of A. digitifera

A. digitifera Rab protein | Cnidarian encoding Rab homologue
Rab-like protein-2B, Rab-2B, Rab-3D, Rab-4B, Rab-5B, Rab-26, Rab-32, Rab-38 | A. pulchella
Rab-like protein-3, Rab-36 | N. vectensis
Rab-2A, Rab-23 | A. pulchella, H. magnipapillata
Rab-like protein-6B, Rab-6A, Rab-7L1, Rab-10, Rab-11B, Rab-30, Rab-33B | A. pulchella, N. vectensis
Rab effector protein-1, Rab11-interacting protein-3/4 | H. magnipapillata, N. vectensis
Rab-like protein-4, Rab-like protein-5, Rab-1A, Rab-5C, Rab-7A, Rab-8A, Rab-9A, Rab-14, Rab-18, Rab-20, Rab-21, Rab-24, Rab-27A, Rab-35 | A. pulchella, H. magnipapillata, N. vectensis
Rab-interacting lysosomal protein, Rab11-interacting protein-1/2/5, Rab-1B, Rab-9B, Rab-3A, Rab-39A, Rab-40 | No match

epigenetic reprogramming of the chromatin histone- 
DNA complex at different stages of development. 

Planula and early developmental proteins 

In this section we discuss predicted proteins encoded in 
the A. digitifera genome having functional homology to 
known proteins that are specific to early embryonic development,
planula larvae function and morphogenesis, which
are given in Table 4. Annotation of the coral genome re- 
veals a large set of homeobox proteins involved in the 
regulation of anatomical development during morphogenesis.
The homeobox is a highly conserved DNA sequence
within genes that encodes a protein domain (the homeodomain)
that binds to DNA in a sequence-specific manner [81], often at the
promoter region of the target gene, to affect transcription
in the developing embryo. Amongst these transcriptional
regulators, Hox genes are essential to metazoan develop- 
ment as their expressed proteins differentiate embryonic 
regions along the anterior-posterior axis (the Hox code) 
and are recognised for their contribution to the evolu- 
tion of morphological diversity [82]. Hox genes are well 
characterised in cnidarians and, given their importance 
in embryonic development, it is not surprising that molecular
evidence from the Cnidaria reveals that the genetic
origins of Hox genes predate the cnidarian-bilaterian
divergence [83-85] yet had evolved after divergence of 
the sponge and eumetazoan lineages [86]. Hox genes of 
cnidarians are typically located in a conserved genomic 
collinear cluster, which is apparent also for A. digitifera, 
whereby the order of the genes on the chromosome is 
the same as that of gene expression in the developing 
embryo. Included in our annotation are genes encoding 



Table 3 Synaptotagmin proteins in the predicted proteome of A. digitifera

Gene sequence | GenBank Accession | Genome encoded homologue
v1.08623 | GI:268530614 | Caenorhabditis briggsae: XP_002630433 (worm)
v1.20682; v1.10560; v1.02080; v1.10015 | GI:150416761 | Platynereis dumerilii: ABR68850 (worm)
v1.10269; v1.04412 | GI:288869516 | Nasonia vitripennis: NP_001165865 (wasp)
v1.01508 | GI:29378331 | Lymnaea stagnalis: AA093847 (snail)
v1.18613 | GI:391339919 | Metaseiulus occidentalis: XP_003744294 (mite)
v1.07402 | GI:260834895 | Branchiostoma floridae: XP_002612445 (lancelet)
v1.01542 | GI:149067023 | Rattus norvegicus: EDM16756 (rat)
v1.20683 | GI:383860584 | Megachile rotundata: XP_003705769 (bee)
v1.17688 | GI48529130 | Oreochromis niloticus: XPJXB452067 (fish)
v1.15777; v1.14902 | GI:269785031 | Saccoglossus kowalevskii: NP_001161667 (worm)
v1.17175; v1.11521 | GI:11559313 | Halocynthia roretzi: BAB18864 (ascidian)
v1.03344; v1.03345 | GI:12658419 | Manduca sexta: AF331039 (moth)
v1.16152 | GI:395729192 | Pongo abelii: XP_003780414 (orangutan)
v1.10268 | GI:327283049 | Anolis carolinensis: XP_003226254 (lizard)
v1.02778 | GI:125984480 | Drosophila pseudoobscura: XP_001356004.1 (fly)
v1.02083; v1.02777 | GI:226490194 | Schistosoma japonicum: CAX69339.1 (fluke)
v1.04326 | GI:167744962 | Homo sapiens: 2R83_A (human)
v1.14682; v1.04180 | GI:241704658 | Ixodes scapularis: XP_002411967 (tick)



two LIM homeobox proteins and a LIM homeobox tran- 
scription factor (Lhx) having conserved roles in neuronal 
development [87], which in N. vectensis are responsible for 
the development of neural networks in developing larvae 
and juvenile polyps [88]. Unlike N. vectensis [89], the coral
genome encodes a homeobox BarH-like protein that in
vertebrates directs neurogenesis [90]. Distinct from 
homeodomain proteins, but serving similar functions, 
are various protein activators, regulators and receptors 
of cellular morphogenesis. Annotation of the coral gen- 
ome has revealed multiple sequence alignments to a pro- 
tein homologue of the dishevelled-associated activator of 
Table 4 Planula and early developmental proteins in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.09797; v1.11180; v1.08414 | K03776 | Aerotaxis receptor (oxygen sensing)
v1.07838 [+5 other sequence copies] | K07822 | Archaeal flagellar protein FlaC
v1.14039; v1.11310; v1.11309 | K05502 | Bone morphogenetic protein 1
v1.01025; v1.17008; v1.15796; v1.23658 | K04662 | Bone morphogenetic protein 2/4
v1.02299; v1.07696; v1.10675 | K04663 | Bone morphogenetic protein 5/6/7/8
v1.06335; v1.01763 | K04673 | Bone morphogenetic protein receptor type-1A
v1.13481 | K13578 | Bone morphogenetic protein receptor type-1B
v1.10550 [+4 other sequence copies] | K04671 | Bone morphogenetic protein receptor type-2
v1.00912 [+4 other sequence copies] | K13579 | Bone morphogenetic protein receptor type-1, invertebrate
v1.19370 | K14624 | C-C motif chemokine 2
v1.23163 | K12499 | C-C motif chemokine 5
v1.08576 | K05511 | C-C motif chemokine 15/23
v1.09229 | K05512 | C-C motif chemokine 19/21
v1.09305 | K08373 | C-C chemokine receptor-like 2
v1.04942 | K04179 | C-C chemokine receptor type 4
v1.02658 | K04245 | Chemokine-like receptor 1
v1.21300 | K12671 | C-X-C motif chemokine 10
v1.16396; v1.21991 | K10035 | C-X-C motif chemokine 16
v1.23712 | K11522 | Chemotaxis family two-component system response regulator PixG
v1.09435 | K13490 | Chemotaxis family, histidine kinase sensor response regulator (WspE-like)
v1.14142; v1.05300 | K05874 | Chemotaxis protein I, serine sensor receptor (MCP family)
v1.07361 | K05877 | Chemotaxis protein IV, peptide sensor receptor (MCP family)
v1.17411 | K03414 | Chemotaxis protein CheZ
v1.16104 | K00575 | Chemotaxis protein methyltransferase CheR
v1.15537 [+7 other sequence copies] | K08482 | Circadian clock protein KaiC
v1.14925 [+4 other sequence copies] | K02223 | Circadian locomotor output cycles kaput protein
v1.06432 [+9 other sequence copies] | K04512 | Dishevelled associated activator of morphogenesis
v1.17637 [+70 other sequence copies] | K10408 | Dynein heavy chain, axonemal
v1.00202 [+5 other sequence copies] | K10409 | Dynein intermediate chain 1, axonemal
v1.04986; v1.09649; v1.23645 | K11143 | Dynein intermediate chain 2, axonemal
v1.08695; v1.09481; v1.23153 | K10411 | Dynein light chain 1, axonemal
v1.11684 | K10412 | Dynein light chain 4, axonemal
v1.23322; v1.01131; v1.04207 | K10410 | Dynein light intermediate chain, axonemal
v1.14083 | K02401 | Flagellar biosynthetic protein FlhB
v1.16997 | K02420 | Flagellar biosynthetic protein FliQ
v1.02867 | K02396 | Flagellar hook-associated protein 1 FlgK
v1.18101; v1.13427 | K02408 | Flagellar hook-basal body complex protein FliE
v1.04339; v1.07633 | K06603 | Flagellar protein FlaG
v1.17895 [+5 other sequence copies] | K02383 | Flagellar protein FlbB
v1.21111 | K02413 | Flagellar protein FliJ
v1.17651 [+13 other sequence copies] | K02415 | Flagellar protein FliL
v1.01971 [+6 other sequence copies] | K02418 | Flagellar protein FliO/FliZ
v1.14031 | K02423 | Flagellar protein FliT
Table 4 Planula and early developmental proteins in the predicted proteome of A. digitifera (Continued)

Gene sequence | KEGG Orthology | Encoded protein description
[illegible] | [illegible] | Flagellar P-ring protein precursor FlgI
[illegible] | [illegible] | Flagellar M-ring protein FliF
[illegible] | [illegible] | Homeobox protein aristaless-like 4
v1.24732 [+5 other sequence copies] | [illegible] | Homeobox protein aristaless-related
[illegible] | [illegible] | Homeobox protein cut-like
[illegible] | [illegible] | Homeobox protein engrailed
[illegible] | [illegible] | Homeobox even-skipped homologue protein
[illegible] | [illegible] | Homeobox protein expressed in ES cells 1
[illegible] | [illegible] | Homeobox protein goosecoid
[illegible] | [illegible] | Homeobox protein goosecoid-like
[illegible] | [illegible] | Homeobox protein, BarH-like (vertebrate neurogenesis)
[illegible] | [illegible] | Homeobox protein DLX, invertebrate
[illegible] | [illegible] | Homeobox protein EMX
[illegible] | [illegible] | Homeobox protein GBX
[illegible] | [illegible] | Homeobox protein GSH
[illegible] | [illegible] | Homeobox protein HB9
[illegible] | [illegible] | Homeobox protein HEX
[illegible] | [illegible] | Homeobox protein HLX1
[illegible] | [illegible] | Homeobox protein HoxA/B2
[illegible] | [illegible] | Homeobox protein HoxA/[illegible]
[illegible] | [illegible] | Homeobox protein HoxA/B/C/D4
[illegible] | [illegible] | Homeobox protein HoxA/B/D1
[illegible] | [illegible] | Homeobox protein LBX
[illegible] | [illegible] | Homeobox protein Unc-4
[illegible] | [illegible] | Homeobox protein ventral anterior
[illegible] | [illegible] | Homeobox protein Nkx-1
v1.12852 [+4 other sequence copies] | [illegible] | Homeobox protein Nkx-2.2
[illegible] | [illegible] | Homeobox protein Nkx-2.5
[illegible] | [illegible] | Homeobox protein Nkx-2.8
[illegible] | [illegible] | Homeobox protein Nkx-3.1
[illegible] | [illegible] | Homeobox protein Nkx-3.2
[illegible] | [illegible] | Homeobox protein Nkx-5
[illegible] | [illegible] | Homeobox protein Nkx-6.1
[illegible] | [illegible] | Homeobox protein Nkx-6.2
[illegible] | [illegible] | Homeobox protein MOX
v1.00602 [+6 other sequence copies] | [illegible] | Homeobox protein OTX
[illegible] | [illegible] | LIM homeobox protein 3/4
v1.11281; v1.05135 | K09375 | LIM homeobox protein 6/8
v1.07988; v1.22037 | K09371 | LIM homeobox transcription factor 1
v1.09328 [+5 other sequence copies] | K10394 | Kinesin family member 3/17
v1.09196; v1.12479 | K11525 | Methyl-accepting chemotaxis protein PixJ (MCP family)
v1.17028; v1.13473 | K08473 | Nematode chemoreceptor
v1.13159; v1.00655 | K09330 | Paired mesoderm homeobox protein 2
v1.15178; v1.10962; v1.16587; v1.01557 | K02633 | Period circadian protein
v1.23288; v1.13857 | K04627 | Pheromone a factor receptor
Table 4 Planula and early developmental proteins in the predicted proteome of A. digitifera (Continued)

Gene sequence | KEGG Orthology | Encoded protein description
v1.22464; v1.17135 | K11213 | Pheromone alpha factor receptor
v1.05611 [+13 other sequence copies] | K08502 | Regulator of vacuolar morphogenesis
v1.04431 | K09333 | Retina and anterior neural fold homeobox-like protein
v1.17636 | K09331 | Short stature homeobox protein
v1.14704 | K09340 | T-cell leukemia homeobox protein
v1.11765 | NA | Tektin
v1.04154 | K02669 | Twitching motility protein PilT

NA: KEGG orthology designation not assigned.



morphogenesis 1 (Daam1) that initiates cytoskeleton formation
via the control of actin assembly. Daam1 was
found to be crucial for gastrulation in Xenopus [91], whereas
Daam1 mutants of Drosophila exhibit tracheal defects
[92], and in mammals Daam1 is highly expressed in multiple
developing organs and is deemed essential for cardiac
morphogenesis [93]. Similar morphogenetic genes
express regulatory proteins that are necessary for vacuole 
biogenesis in yeasts [94]. Others express bone morpho- 
genetic proteins (and their BMP receptors), which are po- 
tent multi-functional growth activators that belong to the 
transforming growth factor beta (TGFbeta) cytokine 
superfamily of proteins that in humans have various functions
during embryogenesis, skeletal formation, neurogenesis
and haematopoiesis [95]. However, since many of the
homeobox and morphogenetic proteins (Table 4) are homologues
of proteins with functions ascribed to higher organisms,
their precise function in A. digitifera cannot be
ascertained by KEGG orthology alone.

Another protein encoded in the A. digitifera genome 
is a retina and anterior neural fold homeobox-like 
(RAX) protein that may activate the development of 
primitive coral photoreceptors [96,97], including a blue 
light-sensing, cryptochrome photoreceptor that in A. 
millepora is implicated in the detection of light from the 
lunar cycle of night time illumination to signal synchron- 
ous coral spawning [98,99]. Photosensitive behaviours and 
the circadian rhythms of corals are well described, and di- 
urnal cycles of gene transcription that regulate circadian 
biological processes in the coral A. millepora have been 
reported [100]. Such traits in A. millepora appear regu- 
lated by an endogenous biological clock entrained to daily 
cycles of solar illumination [101]. Annotation of the A. 
digitifera genome reveals a circadian timekeeper protein 
KaiC [102] that in cyanobacteria is activated during the di- 
urnal phosphorylation rhythm [103,104]. In Synechococcus 
elongatus, KaiC regulates the rhythmic expression of all 
other proteins encoded in the genome [105], yet no 
homologue of any of the prokaryotic clustered circadian 
kaiABC genes has been identified in eukaryotes [106]. The
coral annotation nonetheless lists KaiC together with a homologue of the
eukaryotic period (Per) circadian protein, which in Drosophila drives circadian
rhythms in eclosion (hatching) and locomotor activity
[107]. In addition, a circadian locomotor output cycles
kaput (CLOCK) homologue (Table 4) was found in our 
annotation. Since CLOCK proteins serve as an essential 
activator of downstream elements in pathways critical to 
the regulation of circadian rhythms in eukaryotes [108], it 
would be worthwhile to examine how transcription of the
RAX-like homeobox protein in this coral contributes to 
the development of circadian functions by activation of 
kaiC, per and Clock genes. Such a study might reveal that 
components of the animal circadian clock are more an- 
cient than data previously suggested [109]. 

Broadcast-spawning corals, such as A. digitifera, re- 
lease gametes, and the fertilised eggs develop into pla- 
nula larvae within the water column until they have 
reached settlement competency, find a suitable hard sub- 
strate, attach and develop into the polyp on metamor- 
phosis. Coral sperm and planula larvae achieve motility 
using flagella (sperm) or cilia (larvae) as their locomotor 
organelles. The eukaryotic axonemes of cilia
and flagella are composed of a dynein ATPase protein that
provides mechanochemical energy transduction, together
with the principal structural proteins of the ciliary/flagellar
microtubules [110]. The flagellar/ciliary microtubules
consist of filaments composed of α- and β-tubulins,
microtubule-stabilising tektins and kinesin motor pro- 
teins [111-113]. The coral genome encodes members of 
the dynein axonemal (flagella and cilia) proteins (Table 4) 
and many of the dynein cytoplasmic proteins (not tabu- 
lated), the latter being involved in intracellular organelle 
transport and centrosome assembly. The coral genome 
encodes α- and β-tubulins and members of the eukaryotic
kinesin superfamily proteins (not tabulated). Amongst the 
many kinesin proteins encoded in the coral genome is the 
kinesin family member 3/17 protein, which is a direct 
homologue of the kinesin-II intraflagellar transport pro- 
tein FLA10 essential for flagella assembly in the alga 
Chlamydomonas [114]. The microtubule-stabilising tektin 
protein, which is required for cilia and flagella assembly 
[113], is also encoded in the coral genome [note: there is 
no KEGG orthology identifier assigned to this protein]. It 
was a surprise, however, to find a large complement of 
prokaryotic flagellar proteins encoded in the coral genome,
consisting of archaeal flagellar (FlaC and FlaG) and bacterial
filament (FibB, FliE, FliF, FliJ, FliK, FliO/FliZ, FliQ and
FliT) homologue components (Table 4). Included also are
the prokaryote homologues FlgN and FlbB that regulate 
transcriptional activation of flagellar assembly [115,116] 
and FlhB which controls the substrate specificity of the en- 
tire prokaryotic flagellar apparatus [117]. Encoded in the 
coral genome is a flagella-independent Type IV twitching 
motility protein PilT that affords social gliding translocation
in many prokaryotic organisms controlled by complex
signal transduction systems that include two-component 
sensor regulators [118]. It is unlikely that these genes are 
derived from contamination from bacterial DNA. Such 
contamination would manifest itself by the random occur- 
rence of bacterial genes from the whole genome including 
many housekeeping genes. In this case, the genes occur as 
members of groups with specialised functions, suggesting 
that multiple horizontal gene transfers between bacteria 
and the coral genome have occurred [119]. Their precise 
function in A. digitifera remains unknown; homologues of 
these prokaryotic genes have not been described previ- 
ously in any other eukaryote genome. 
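
The reasoning against contamination can be illustrated numerically. The following is a minimal sketch and not an analysis performed in this study: it assumes that random contamination behaves like drawing genes without replacement from a bacterial genome, and it uses a hypergeometric tail probability to ask how likely it is that so many of the bacterial-like hits would fall into one specialised functional group by chance. The function name `hypergeom_tail` and every number below are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_tail(total_genes, group_genes, sampled, observed):
    """P(X >= observed) when `sampled` genes are drawn at random, without
    replacement, from `total_genes`, of which `group_genes` belong to the
    functional group of interest."""
    upper = min(group_genes, sampled)
    favourable = sum(
        comb(group_genes, i) * comb(total_genes - group_genes, sampled - i)
        for i in range(observed, upper + 1)
    )
    return favourable / comb(total_genes, sampled)


# Hypothetical, illustrative numbers only (NOT taken from the annotation):
TOTAL_GENES = 4000            # genes in an assumed bacterial genome
MOTILITY_GENES = 60           # of which flagellar/chemotaxis genes
BACTERIAL_LIKE_HITS = 40      # bacterial-like genes found in the host genome
HITS_IN_MOTILITY_GROUP = 25   # of which fall in the motility group

p_random = hypergeom_tail(
    TOTAL_GENES, MOTILITY_GENES, BACTERIAL_LIKE_HITS, HITS_IN_MOTILITY_GROUP
)
print(f"P(at least {HITS_IN_MOTILITY_GROUP} motility genes by chance) = {p_random:.3e}")
```

Under these assumed numbers the probability is vanishingly small, which is the quantitative form of the statement that random contamination should sample housekeeping genes rather than coherent functional groups. A real test would need the actual gene counts and a defensible reference genome.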

Linked closely with flagellar/ciliary proteins are the sen- 
sory receptors that signal chemoattraction or avoidance to 
direct cellular motility. The coral genome reveals a variety 
of genes that encode chemoreceptor and chemotaxis 
proteins (Table 4). The chemoreceptor proteins of A. 
digitifera include an oxygen-sensing aerotaxis receptor 
that in bacteria invokes an avoidance response to anoxic 
micro-environments [120]. Encoded also are a nematode 
sensory chemoreceptor homologue [121], two homo- 
logous pheromone factor receptor proteins that in fungi 
activate a species-specific mating response [122], three 
chemotaxis protein sensor receptors belonging to the 
methyl-accepting chemotaxis family of proteins (MCPs) in 
bacteria and archaea [123], and two proteins (CheZ and 
CheR) and two regulators (PixG and WspE) of the two-component
signal transduction (TCST) system for activation
of gene expression. In bacteria and archaea, as well as
some plants, fungi and protozoa [124], TCST systems me- 
diate many cellular processes that respond to a broad 
range of environmental stimuli via activation of a specific 
histidine (or serine) kinase sensor and its cognate response 
regulator [125]. There are 77 sequence matches to various 
elements of the TCST family of proteins in the A. digitifera
genome (data not tabulated). Included also are genes encoding
members of the chemotactic cytokine (chemokine)
family of sensory proteins that on secretion direct
chemotaxis in nearby responsive cells by stimulating target 
chemokine receptors; both chemokine and chemokine 
receptor proteins are encoded in the coral genome. Sig- 
nificantly, sensory chemokines/chemokine receptors are 
found in all vertebrates, some viruses and some groups of 



bacteria, but none have been described previously for in- 
vertebrates [126]. 

Neural messengers, receptors and sensory proteins 

Corals and other cnidarians are the earliest extant group 
of organisms to have a primitive nervous system network 
[127] thought to have evolved from a eumetazoan ancestor
prior to the divergence of Cnidaria and the
Bilateria [128,129]. Unlike marine sponges (Porifera) 
that predate synaptic innovation [130], cnidarians possess 
a homogenous nerve net that, although lacking any form 
of cephalization, accommodates fundamental neurosen- 
sory transmission across the nerve net to end in a moto- 
neural junction to coordinate tentacle movement required 
for feeding and predator avoidance [131]. The nervous 
systems of cnidarians consist of both ectodermal sensory 
cells and their effector cells and endodermal multipolar 
ganglions capable of neurotransmission [132]. At the func- 
tional level, synaptic transmission in cnidarians relies on 
fast neurotransmitters (glutamate, GABA, glycine) and 
slow neurotransmitters (catecholamine, serotonin, neuro- 
peptides) for sensory-signal conduction [133]. At the 
ultrastructural level, many cnidarian neurons have multi- 
functional traits of sensory, neurosecretory and stimula- 
tory attributes [134]. Significantly, the genome of A. 
digitifera encodes a ciliary neurotrophic
factor, which is a polypeptide hormone and nerve growth 
factor that promotes neurotransmitter synthesis, neurite 
outgrowth and regeneration [135]. Additionally, the coral 
genome encodes nerve growth factor and neurotrophic 
kinase receptors, a survival motor neuron protein, a sur- 
vival neuron splicing factor, the neural outgrowth protein 
neurotrimin, and a neurotrophin growth factor attributed 
to signalling neuron survival, differentiation and growth 
(Table 5). Encoded for neuron regulation and development 
are several neuron cation-gated channels, a neuronal guanine
nucleotide exchange factor, a neurotransmitter Na+
symporter, several neurogenic differentiation proteins, a 
neuronal PAS domain transcription factor for activation of 
neurogenesis, the axon guidance protein neuropilin-2, a
neural crest protein of embryonic neural development,
neural ELAV-like transcription proteins of neurogenesis, a
Notch protein (79 sequence domain matches) and a neuralized
protein subset of the Notch signalling pathway
that promotes neuron proliferation in early neurogenic 
development. Structural elements of the coral nerve net 
include neurofilament polypeptides and neuronal adhesion 
proteins. 

Cnidarians differentiate highly specialised sensory and 
mechanoreceptor cells involved in the capture of prey 
and for defence against predators. Their stinging cells, 
termed nematocysts or cnidocytes, are stimulated by ad- 
jacent chemosensory cells. Nematocysts trigger the re- 
lease of a stinging barb (cnidae tubule) via ultra-fast 
exocytosis on physical contact with ciliary mechanoreceptors
of the cnidocyte to deliver the discharge of its
venom [136]. Despite considerable advances in the sen- 
sory biology of cnidarians, knowledge of the specific 
receptor genes that regulate cnidocyte function remains 
incomplete. In Hydra, and perhaps other cnidarians, cni- 
docyte discharge is controlled by an ancient light- 
activated, opsin-mediated phototransduction pathway 
[137] that precedes the evolution of cubozoan (box jelly- 
fish) eyes [138]; cubozoans are the most basal of animals 
to have eyes containing a lens and ciliary-type visual cells 
similar to those of vertebrate eyes [139]. These G protein-coupled
opsin photoreceptors of the retinylidene-forming protein
family encoded in the genome of A. digitifera include
rhodopsin, bacteriorhodopsin, c-opsin, r-opsin and Go-opsin
(Table 5), but not the Gs-subfamily of opsin receptors
reported to be present in sea anemones, hydra and
jellyfish [140], that together with cyclic nucleotide-gated
(CNG) ion channel proteins, arrestin (β-adrenergic receptor
inhibitor) and other retino-protein receptors, are usual
components of the bilaterian phototransduction cascade.
Present also are genes to express rhodopsin kinase and β-adrenergic
receptor kinase, which are related members of
the serine/threonine kinase family of proteins that specifically
initiate deactivation of G protein-coupled receptors.
Additional proteins of retinol metabolism of the phototransduction
pathway encoded in the A. digitifera genome
are retinol dehydrogenase, all-trans-retinol 13,14-reductase
and phosphatidylcholine (lecithin)-retinol O-acyltransferase,
a neural retina-specific leucine zipper protein
that is an intrinsic regulator of photoreceptor develop- 
ment and function, and a retina and anterior neural fold 
homeobox-like protein that modulates the expression of 
photoreceptor genes within the rhodopsin promoter. 
The genome of A. digitifera encodes also a blue light- 
sensing, cryptochrome photoreceptor thought to signal 
synchronous coral spawning by detecting illumination 
from the lunar cycle [98,99].

The A. digitifera genome reveals genes to express a broad 
array of neurotransmitter receptor proteins (Table 5), in- 
cluding glycine and glutamate neuroreceptors, adrenergic 
receptors that target non-dopamine catecholamines (i.e., 
epinephrine and norepinephrine), dopamine, muscarinic 
and nicotinic acetylcholine receptors, sensory G protein-coupled
receptors and γ-aminobutyric acid (GABA)
ligand-gated ion channel and G protein-coupled recep- 
tors (and inhibitors), several of which are encoded in 
high copy numbers. Cellular trafficking of neurotrans- 
mitters to presynaptic terminals is essential for neuro- 
transmission, and significantly the genome of A. digitifera 
encodes a wide range of solute carrier neurotransmitter 
transporters, including a high affinity choline trans- 
porter and an acetylcholine-specific protein belonging 
to the major facilitator superfamily (MFS) of secondary 



transporters. Encoded also is dopamine β-monooxygenase
that catalyses the conversion of dopamine to norepinephrine
in the catecholamine biosynthetic pathway,
which is necessary for cross-activation of adrenergic 
neuroreceptors [141]. Notably, the A. digitifera genome 
encodes acetylcholinesterase that is expressed at neuro- 
muscular junctions and cholinergic synapses where its 
esterase activity serves to terminate synaptic transmission.

The primitive nervous networks of cnidarians are 
strongly peptidergic with at least 35 neuropeptides iden- 
tified from different cnidarian classes [142]. Our an- 
notation of the sequenced A. digitifera genome, however, 
revealed only the neuropeptide FF-amide neurotransmitter,
an RFamide-related peptide, and its neuropeptide FF
and Y receptors (Table 5). Neuropeptides are usually 
expressed as large precursor proteins which comprise 
multiple copies of "immature" neuropeptides. Our an- 
notation did not readily reveal these precursor neuro- 
peptide proteins, but we did find enzymes required for 
their processing, for example, a variety of carboxypeptidase 
enzymes (not tabulated) that remove propeptide carboxyl 
residues at basic peptidase sites, and the mature peptide 
neurotransmitters that are finished by consecutive modification
by peptidylglycine (α-hydroxylating) monooxygenase
(PHM) and peptidyl-α-hydroxyglycine α-amidating lyase
(PAL) enzymes, both of which are commonly expressed in
mammals as a single bifunctional peptidylglycine monooxygenase
(K00504/EC 1.14.17.3) [143]. Our extensive catalogue
of animal-like neural and sensory proteins revealed
by genome annotation is testament that essential neuro- 
biological features were developed in the primitive neural 
networks of early eumetazoan evolution. 
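
The processing route described above (a precursor carrying several immature peptide copies, cleavage at basic sites, removal of C-terminal basic residues, then amidation by PHM/PAL) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a model of any sequence from the annotation: the precursor string is hypothetical, cleavage is assumed to occur only after dibasic sites (KR, RR, KK, RK), and amidation is modelled as consuming a C-terminal glycine. The names `process_precursor` and `HYPOTHETICAL_PRECURSOR` are introduced here for the example only.

```python
import re

# Hypothetical precursor with three immature peptide copies, each ending in
# glycine and followed by a dibasic cleavage site. Not a real coral sequence.
HYPOTHETICAL_PRECURSOR = "QGRFGKRQGRFGKRQGRFGRR"


def process_precursor(precursor):
    """Toy model of neuropeptide maturation from a precursor protein."""
    # Step 1: endoproteolytic cleavage after each dibasic site.
    marked = re.sub(r"([KR][KR])", r"\1|", precursor)
    peptides = []
    for fragment in marked.split("|"):
        # Step 2: carboxypeptidase-like removal of C-terminal basic residues.
        trimmed = fragment.rstrip("KR")
        # Step 3: PHM/PAL-like amidation. The C-terminal glycine is consumed
        # and the preceding residue is left as a C-terminal amide.
        if len(trimmed) > 1 and trimmed.endswith("G"):
            peptides.append(trimmed[:-1] + "-NH2")
    return peptides


print(process_precursor(HYPOTHETICAL_PRECURSOR))
# prints ['QGRF-NH2', 'QGRF-NH2', 'QGRF-NH2']
```

Fragments that do not end in glycine after trimming are discarded in this sketch, which mirrors the point made in the text that the amidating enzymes act on glycine-extended intermediates. Real precursors also undergo N-terminal processing that is not modelled here.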

Calcification and Ca2+-signalling proteins

The massive structures of coral reefs evident today are a 
construction of aggregated calcium carbonate deposited 
over long geological time by scleractinian corals and 
other calcifying organisms, yet our understanding of the 
molecular mechanisms that regulate the biological process
of coral calcification is limited [144]. Ca2+ transfer
from seawater to the calicoblastic site of coral calcification
occurs by passive diffusion through the gastrovascular
cavity [145] and by active calcium transport [146].
Active entry of Ca2+ through the oral epithelial layer is
regulated by voltage-dependent calcium channels, such
as demonstrated by the L-type alpha protein cloned
from the reef-building coral Stylophora pistillata [147].
Ca2+ transport across the calicoblastic ectoderm to the
extracellular calcifying site is facilitated by the plasma-membrane
ATP-dependent calcium pump that in S.
pistillata resembles the Ca2+-ATPase family of mammalian
proteins [148]. By 2H+/Ca2+-exchange at the calicoblastic
membrane, Ca2+-ATPase removes H+ (from the net reaction
Ca2+ + CO2 + H2O => CaCO3 + 2H+), thereby
Table 5 Neuronal and sensory proteins in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.01918 [+5 other sequence copies] | K01049 | Acetylcholinesterase
v1.18087; v1.14516 | K04136 | Adrenergic receptor alpha-1B
v1.06394 | K04137 | Adrenergic receptor alpha-1D
v1.09628; v1.15688; v1.00966 | K04140 | Adrenergic receptor alpha-2C
v1.19831; v1.20450 | K04142 | Adrenergic receptor beta-2
v1.17293 | K00910 | beta-Adrenergic-receptor kinase
v1.13740 [+5 other sequence copies] | K04828 | Amiloride-sensitive cation channel 1, neuronal (degenerin)
v1.23541 [+6 other sequence copies] | K04829 | Amiloride-sensitive cation channel 2, neuronal
v1.09323 [+4 other sequence copies] | K04439 | beta-Arrestin
v1.07723; v1.22465 | K04641 | Bacteriorhodopsin
v1.08062 | K05420 | Ciliary neurotrophic factor
v1.03288 [+5 other sequence copies] | K02295 | Cryptochrome
v1.20011; v1.20036; v1.20084; v1.18607 | K04948 | Cyclic nucleotide gated channel alpha 1
v1.21470 | K04951 | Cyclic nucleotide gated channel alpha 4
v1.21783; v1.01466; v1.01466; v1.01466 | K05326 | Cyclic nucleotide gated channel, invertebrate
v1.03645 | K05391 | Cyclic nucleotide gated channel, other eukaryote
v1.21256 | K08762 | Diazepam-binding inhibitor (GABA receptor, acyl-CoA-binding protein)
v1.22156 [+6 other sequence copies] | K00503 | Dopamine beta-monooxygenase
v1.21775; v1.15989 | K04148 | Dopamine D1-like receptor
v1.14160; v1.01697 | K04144 | Dopamine receptor D1
v1.05089; v1.20018 | K04145 | Dopamine receptor D2
v1.14030; v1.23273 | K04146 | Dopamine receptor D3
v1.20536 | K13088 | ELAV-like protein 1
v1.18658 [+5 other sequence copies] | K13208 | ELAV-like protein 2/3/4
v1.05774 [+18 other sequence copies] | K04313 | G protein-coupled receptor 6
v1.00572; v1.18152 | K08404 | G protein-coupled receptor 17
v1.23842 | K04316 | G protein-coupled receptor 19
v1.03948 | K08411 | G protein-coupled receptor 26
v1.09271 | K08383 | G protein-coupled receptor 34
v1.05595 | K04243 | G protein-coupled receptor 37 (endothelin receptor type B-like)
v1.04019 | K08409 | G protein-coupled receptor 45
v1.19913; v1.09821; v1.04291 | K08450 | G protein-coupled receptor 56
v1.05404 | K04321 | G protein-coupled receptor 63
v1.02179; v1.10397 | K08451 | G protein-coupled receptor 64
v1.23269 [+5 other sequence copies] | K08408 | G protein-coupled receptor 68
v1.21091 | K08421 | G protein-coupled receptor 84
v1.11008 | K04302 | G protein-coupled receptor 85
v1.21884; v1.01951 | K08452 | G protein-coupled receptor 97
v1.03243 [+13 other sequence copies] | K08378 | G protein-coupled receptor 103
v1.13790; v1.18939 | K08453 | G protein-coupled receptor 110
v1.09442; v1.14019 | K08455 | G protein-coupled receptor 112
v1.24009 | K08456 | G protein-coupled receptor 113
v1.04290 | K08459 | G protein-coupled receptor 114
Table 5 Neuronal and sensory proteins in the predicted proteome of A. digitifera (Continued)

Gene sequence | KEGG Orthology | Encoded protein description
v1.06608; v1.24223 | K08457 | G protein-coupled receptor 115
v1.10800 [+6 other sequence copies] | K08458 | G protein-coupled receptor 116
v1.07662 [+6 other sequence copies] | K08462 | G protein-coupled receptor 125
v1.09663; v1.08981 | K08463 | G protein-coupled receptor 126
v1.24252 | K08464 | G protein-coupled receptor 128
v1.02750 [+26 other sequence copies] | K08465 | G protein-coupled receptor 133
v1.05774 [+11 other sequence copies] | K08466 | G protein-coupled receptor 144
v1.05497; v1.13272; v1.01323 | K08436 | G protein-coupled receptor 152
v1.08653 [+5 other sequence copies] | K08467 | G protein-coupled receptor 157
v1.11807; v1.10392; v1.10394 | K08469 | G protein-coupled receptor 158
v1.07294; v1.00247 | K08439 | G protein-coupled receptor 161
v1.05167 | K08442 | G protein-coupled receptor 176
v1.08677; v1.23465; v1.19865; v1.06986 | K12762 | G protein-coupled receptor GPR1
v1.13395 | K08291 | G protein-coupled receptor kinase
v1.18529; v1.07599; v1.05558 | K12487 | G protein-coupled receptor kinase interactor 2
v1.02481 | K04619 | G protein-coupled receptor family C group 5 member B
v1.22242 | K04622 | G protein-coupled receptor family C group 6 member A
v1.08625; v1.13650; v1.13048; v1.18694 | K04599 | G protein-coupled receptor Mth (Methuselah protein)
v1.07465; v1.10540 | K08341 | GABA(A) receptor-associated protein (autophagy-related protein 8)
v1.09831 [+30 other sequence copies] | K05270 | Gamma-aminobutyric acid (GABA) receptor, invertebrate
v1.18702; v1.11701 | K05183 | Gamma-aminobutyric acid (GABA) A receptor beta-3
v1.04252 [+6 other sequence copies] | K05185 | Gamma-aminobutyric acid (GABA) A receptor epsilon
v1.06325 | K05186 | Gamma-aminobutyric acid (GABA) A receptor gamma-1
v1.00048 | K05188 | Gamma-aminobutyric acid (GABA) A receptor gamma-3
v1.07506 [+6 other sequence copies] | K04615 | Gamma-aminobutyric acid (GABA) B receptor 1
v1.07506 [+24 other sequence copies] | K04616 | Gamma-aminobutyric acid (GABA) B receptor 2
v1.06426; v1.10563; v1.01138 | K05192 | Gamma-aminobutyric acid (GABA) receptor theta
v1.15485 | K05198 | Glutamate receptor, ionotropic, AMPA 2
v1.09807 | K05200 | Glutamate receptor, ionotropic, AMPA 4
v1.04764 | K05207 | Glutamate receptor, ionotropic, delta 2
v1.15247 [+12 other sequence copies] | K05313 | Glutamate receptor, ionotropic, invertebrate
v1.15247 [+7 other sequence copies] | K05202 | Glutamate receptor, ionotropic, kainate 2
v1.00617 | K05203 | Glutamate receptor, ionotropic, kainate 3
v1.09688 [+6 other sequence copies] | K05208 | Glutamate receptor, ionotropic, N-methyl-D-aspartate 1
v1.21204 [+4 other sequence copies] | K05212 | Glutamate receptor, ionotropic, N-methyl-D-aspartate 2D
v1.01622 | K05214 | Glutamate receptor, ionotropic, N-methyl-D-aspartate 3B
v1.01418 [+5 other sequence copies] | K05387 | Glutamate receptor, ionotropic, other eukaryote
v1.04275 | K05194 | Glycine receptor alpha-2
v1.10737; v1.06885 | K05195 | Glycine receptor alpha-3
v1.05488 | K05271 | Glycine receptor alpha-4
v1.08900; v1.06885 | K05196 | Glycine receptor beta
v1.18634 | K05397 | Glycine receptor, invertebrate
v1.14569; v1.14570 | K09071 | Heart- and neural crest derivatives-expressed protein
v1.16783 [+4 other sequence copies] | K02168 | High-affinity choline transport protein
v1.13837 | K07608 | Internexin neuronal intermediate filament protein, alpha
Table 5 Neuronal and sensory proteins in the predicted proteome of A. digitifera (Continued)



v1.01671 | K04309 | Leucine-rich repeat-containing G protein-coupled receptor 4
v1.09480; v1.05605 | K04308 | Leucine-rich repeat-containing G protein-coupled receptor 5
v1.15300 [+ 8 other sequence copies] | K08399 | Leucine-rich repeat-containing G protein-coupled receptor 6
v1.17524 [+ 14 other sequence copies] | K04306 | Leucine-rich repeat-containing G protein-coupled receptor 7
v1.21700; v1.03578; v1.17196 | K04307 | Leucine-rich repeat-containing G protein-coupled receptor 8
v1.16104 | K08396 | Mas-related G protein-coupled receptor member X
v1.08718; v1.02042; v1.02042 | K04604 | Metabotropic glutamate receptor 1/5
v1.22794 [+ 7 other sequence copies] | K04605 | Metabotropic glutamate receptor 2/3
v1.15331 | K04607 | Metabotropic glutamate receptor 4
v1.01418 | K04608 | Metabotropic glutamate receptor 6/7/8
v1.21698; v1.04544; v1.21739 | K14636 | MFS transporter, solute carrier family 18 (acetylcholine transporter) 3
v1.05751; v1.19720; v1.22165; v1.02336 | K04134 | Muscarinic acetylcholine receptor
v1.11550 | K04129 | Muscarinic acetylcholine receptor M1
v1.01913 [+ 4 other sequence copies] | K04131 | Muscarinic acetylcholine receptor M3
v1.18723 | K04132 | Muscarinic acetylcholine receptor M4
v1.08171 | K04133 | Muscarinic acetylcholine receptor M5
v1.07408 [+ 34 other sequence copies] | K02583 | Nerve growth factor receptor (TNFR superfamily member 16)
v1.15265 [+ 91 other sequence copies] | K06491 | Neural cell adhesion molecule
v1.13789; v1.24010; v1.03980 | K09038 | Neural retina-specific leucine zipper protein
v1.24586; v1.16386; v1.16387 | K08052 | Neurofibromin 1
v1.05520; v1.15407; v1.07950 | K04572 | Neurofilament light polypeptide
v1.19724 | K04573 | Neurofilament medium polypeptide (neurofilament 3)
v1.15787 [+ 4 other sequence copies] | K09081 | Neurogenin 1 (neurogenic differentiation protein)
v1.00345; v1.05338; v1.10997 | K08033 | Neurogenic differentiation factor 1
v1.07355; v1.14517 | K09078 | Neurogenic differentiation factor 2
v1.08832 | K09079 | Neurogenic differentiation factor 4
v1.06678; v1.06677 | K01393 | Neurolysin
v1.16238 [+ 19 other sequence copies] | K06756 | Neuronal cell adhesion molecule
v1.20460; v1.16967 | K06757 | Neurofascin NFASC (cell adhesion molecule CAMs)
v1.22060; v1.03561 | K07525 | Neuronal guanine nucleotide exchange factor
v1.03908 | K09098 | Neuronal PAS domain-containing protein 1/3
v1.00089 | K05247 | Neuropeptide FF-amide peptide
v1.21565 | K08375 | Neuropeptide FF receptor 2
v1.06392 [+ 11 other sequence copies] | K04209 | Neuropeptide Y receptor, invertebrate
v1.08609 [+ 31 other sequence copies] | K06819 | Neuropilin 2
v1.11492 [+ 5 other sequence copies] | K03308 | Neurotransmitter:Na+ symporter, NSS family
v1.16744 [+ 8 other sequence copies] | K06774 | Neurotrimin
v1.05353 | K03176 | Neurotrophic tyrosine kinase receptor type 1
v1.20055 | K04360 | Neurotrophic tyrosine kinase receptor type 2
v1.03803 | K04356 | Neurotrophin 3
v1.09523 | K04803 | Nicotinic acetylcholine receptor alpha-1 (muscle)
v1.11940 | K04806 | Nicotinic acetylcholine receptor alpha-4
v1.01548 | K04808 | Nicotinic acetylcholine receptor alpha-6
v1.05056; v1.12097 | K04809 | Nicotinic acetylcholine receptor alpha-7
v1.07222; v1.11069 | K04810 | Nicotinic acetylcholine receptor alpha-9
Table 5 Neuronal and sensory proteins in the predicted proteome of A. digitifera (Continued)



v1.24404 | K04813 | Nicotinic acetylcholine receptor beta-2 (neuronal)
v1.06514; v1.23640 | K04815 | Nicotinic acetylcholine receptor beta-4
v1.18634 | K04816 | Nicotinic acetylcholine receptor delta
v1.18231 [+ 32 other sequence copies] | K05312 | Nicotinic acetylcholine receptor, invertebrate
v1.05293 [+ 78 other sequence copies] | K02599 | Notch protein
v1.15348 [+ 4 other sequence copies] | K04256 | c-Opsin protein
v1.01972 | K08385 | Go-Opsin protein
v1.13345 [+ 5 other sequence copies] | K04255 | r-Opsin protein
v1.00749; v1.03435 | K00504 | Peptidylglycine monooxygenase
v1.12323 [+ 11 other sequence copies] | K00678 | Phosphatidylcholine-retinol O-acyltransferase
v1.18340 [+ 6 other sequence copies] | K09624 | Protease, serine, 12 (neurotrypsin, motopsin)
v1.08030 [+ 9 other sequence copies] | K01931 | Protein neuralized
v1.04431 | K09333 | Retina and anterior neural fold homeobox-like protein
v1.01789; v1.06542 | K00061 | Retinol dehydrogenase
v1.05804 [+ 6 other sequence copies] | K11150 | Retinol dehydrogenase 8
v1.22340; v1.14029 | K11151 | Retinol dehydrogenase 10
v1.24399; v1.07017 | K11154 | Retinol dehydrogenase 16
v1.19667; v1.16885; v1.24371 | K00909 | Rhodopsin kinase
v1.12432; v1.15302; v1.07505 | K09516 | all-trans-Retinol 13,14-reductase
v1.09104 [+ 6 other sequence copies] | K05613 | Solute carrier family 1 (glial high affinity glutamate transporter), member 2
v1.19779; v1.08769; v1.22032 | K05617 | Solute carrier family 1 (high affinity Asp/glutamate transporter), member 6
v1.19293; v1.19292 | K14387 | Solute carrier family 5 (high affinity choline transporter), member 7
v1.10901; v1.19493 | K05336 | Solute carrier family 6 (neurotransmitter transporter), invertebrate
v1.24615 [+ 10 other sequence copies] | K05034 | Solute carrier family 6 (neurotransmitter transporter, GABA) member 1
v1.07932 | K05046 | Solute carrier family 6 (neurotransmitter transporter, GABA) member 13
v1.01817 | K05036 | Solute carrier family 6 (neurotransmitter transporter, dopamine) member 3
v1.20691; v1.16333; v1.15484; v1.02123 | K05038 | Solute carrier family 6 (neurotransmitter transporter, glycine) member 5
v1.15484; v1.15484 | K05042 | Solute carrier family 6 (neurotransmitter transporter, glycine) member 9
v1.18461; v1.09068; v1.02237; v1.20880 | K05333 | Solute carrier family 6 (neurotransmitter transporter) member 18
v1.02239; v1.13836; v1.09067 | K05334 | Solute carrier family 6 (neurotransmitter transporter) member 19
v1.21997 [+ 5 other sequence copies] | K12839 | Survival of motor neuron-related-splicing factor member 30
v1.21997 [+ 6 other sequence copies] | K13129 | Survival motor neuron protein



increasing the saturation state of CaCO3 to sustain calcium precipitation [146].
Importantly, located also at the calicoblastic membrane is carbonic anhydrase [149],
which is required to catalyse the intermediate step of calcification by the
reversible hydration of carbon dioxide (CO2 + H2O ⇌ HCO3- + H+). In coral
phototrophic symbiosis, despite numerous studies describing the well-known
phenomenon of light-enhanced calcification, the relationship linking symbiont
photosynthesis to coral calcification has been elusive [150,151]. Nonetheless,
efforts to better understand the calcifying response of scleractinian corals to
environmental change and ocean acidification are gaining traction [149,152,153].
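
The carbonate chemistry behind the two statements above (carbonic anhydrase supplying bicarbonate, and a raised CaCO3 saturation state sustaining precipitation) can be written out as a short worked set of equilibria. This is a sketch in standard seawater carbonate notation; the symbols Ω (saturation state) and K'sp (stoichiometric solubility product of the mineral phase) are not used in the original text and are introduced here only for illustration.

```latex
\begin{align}
\mathrm{CO_2 + H_2O} &\rightleftharpoons \mathrm{HCO_3^- + H^+}
  && \text{(catalysed by carbonic anhydrase)} \\
\mathrm{HCO_3^-} &\rightleftharpoons \mathrm{CO_3^{2-} + H^+} \\
\mathrm{Ca^{2+} + CO_3^{2-}} &\rightleftharpoons \mathrm{CaCO_3} \\
\Omega &= \frac{[\mathrm{Ca^{2+}}]\,[\mathrm{CO_3^{2-}}]}{K'_{sp}}
\end{align}
```

Precipitation is thermodynamically favoured when Ω > 1, so raising either [Ca2+] or [CO3 2-] at the site of calcification, or removing H+ from it, increases the saturation state.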



Voltage-gated calcium channels (VGCCs) have been examined extensively in
mammalian physiology for converting membrane potential into intracellular Ca2+
transients for signal transduction pathways (reviewed in [154]). VGCC signalling
affects cellular processes that include muscle contraction, neuronal excitation,
gene transcription, fertilisation, cell differentiation and development,
proliferation, hormone release, activation of calcium-dependent protein kinases,
cell death via necrosis and apoptosis pathways, phagocytosis and endo/exocytosis.
Remarkably, annotation of the genome of A. digitifera reveals sequences encoding
homologues of all the VGCC subunits (α1, α2δ, β and γ) of the molecular phenotypes
(L, N, P/Q and R) expressed in mammalian physiology (Table 6). There are multiple
sequences encoding three variants of Ca2+-transporting ATPase, of which at least
one is necessary for coral calcification. There is only one sequence match for
carbonic anhydrase in the genome of A. digitifera, which may reflect the high
catalytic efficiency of this calcifying enzyme [155], although a BLAST search of
ZoophyteBase does reveal scaffolds with low e-values which, on future experimental
inspection, might uncover multiple copies of this enzyme essential for
calcification. There are multiple sequences encoding the solute carrier Na+/Ca2+-
and Na+/K+/Ca2+-exchange families of transport proteins that, together with the
coral Ca2+/H+-antiporter, may regulate cellular pH and Ca2+ homeostasis.

Implicit to coral calcification is Ca2+ regulation that affects signalling of
other vital cellular functions. Cellular Ca2+ is mediated by the calcium-sensing
receptor calmodulin (18 sequence matches) and other messenger calcium-binding
effectors (Table 6), including the calcium-binding protein CML (40 protein domain
sequence matches). Calcium/calmodulin-dependent protein kinases are arguably key
to Ca2+-signalling in coral symbiosis but, with the exception of activation of
sperm flagellar motility [156], their precise role has not been elaborated.
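
The match counts quoted above can be recovered from the "Gene sequence" cells of Table 6, which either list gene models separated by semicolons or give one model followed by "[+ N other sequence copies]". The following is a minimal sketch of that arithmetic; the function name is hypothetical, and it assumes gene-model identifiers have been cleaned to the form v1.NNNNN.

```python
import re

def count_sequence_matches(cell: str) -> int:
    """Count gene models represented by one 'Gene sequence' table cell.

    Assumes cleaned identifiers of the form v1.NNNNN (an assumption about
    the ID format, not something the article specifies).
    """
    # Gene models that are written out explicitly in the cell.
    listed = re.findall(r"v1\.\d{5}", cell)
    # Optional "[+ N other sequence copies]" suffix.
    extra = re.search(r"\[\+\s*(\d+)\s+other sequence cop(?:y|ies)\]", cell)
    return len(listed) + (int(extra.group(1)) if extra else 0)

# Cells as they appear in Table 6 for calmodulin and CML.
print(count_sequence_matches("v1.10079 [+ 17 other sequence copies]"))
print(count_sequence_matches("v1.02323 [+ 39 other sequence copies]"))
```

Applied to the calmodulin and CML rows, the helper gives the 18 and 40 matches cited in the text.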

Plant-derived proteins 

Endosymbiosis has contributed greatly to eukaryotic evolution, most notably to the
genesis of plastids and mitochondria derived from prokaryotic antecedents. Genetic
integration by endosymbiont-to-host transfer (EGT) or replacement (EGR) has been a
significant force in early metazoan innovation, whereby nuclear-transferred genes
may even adopt novel functions in the host cell or replace existing versions of the
protein that they encode [157]. Prokaryote-to-eukaryote gene transfer has been
widespread in evolution, but examples of genetic exchange between unrelated
eukaryotes, such as between algal symbionts and their multicellular eukaryote host,
are considered rare (reviewed by [158,159]). One such example is aroB
(3-dehydroquinate synthase) transferred to the genome of the sea anemone
N. vectensis, whose sequence best matches that of the dinoflagellate Oxyrrhis
marina [119]. Close inspection of the amino acid sequence of the aroB gene product,
as reported by Shinzato et al. [45], clearly shows this protein to be
2-epi-5-epi-valiolone synthase (EVS), a sugar phosphate cyclase orthologue that
catalyses the conversion of sedoheptulose 7-phosphate to 2-epi-5-epi-valiolone,
found to be a precursor of the mycosporine-like amino acid (MAA) sunscreen
shinorine in the cyanobacterium Anabaena variabilis [160]. Additionally, the EVS
gene of N. vectensis has a distinctive O-methyltransferase fusion that is identical
in O. marina [161]. The shikimate pathway is essential to apicomplexan parasites of
the genera Plasmodium, Toxoplasma and Cryptosporidium; these parasites and
Tetrahymena ciliates express a pentafunctional aroM gene similar to that of
Ascomycetes, which is thought to have been conveyed by fungal gene transfer to a
common ancestral progenitor [162]. In a separate example, H. viridis expresses a
plant-like ascorbate peroxidase gene (HvAPX1) during oogenesis in both symbiotic
and aposymbiotic individuals [163]; peroxidase activity coincides with oogenesis
and embryogenesis and, in Hydra, acts as a ROS scavenger to protect the oocyte from
apoptotic degradation [164]. The sacoglossan (sea slug) molluscs Elysia chlorotica
and E. viridis (Plakobranchidae) acquire plastids on ingestion of the siphonaceous
alga Vaucheria litorea (termed "kleptoplasty") and, by maintaining sequestered
plastids in an active photosynthetic state, have emerged as model organisms for the
transfer of nuclear-encoded plant genes from algal symbiont to animal host [165].
In this symbiosis, the family of light-harvesting genes psbO, prk
(phosphoribulokinase) and chlorophyll synthase (chlG) are entrained in the genome
of Elysia chlorotica (reviewed in [166,167]), although there is debate whether
these genes are transcriptionally expressed (compare [168] and [169]). Also,
phylogenomic analysis of the predicted proteins of the aposymbiotic unicellular
choanoflagellate Monosiga brevicollis, considered to be a stem progenitor of the
animal kingdom [170,171], reveals 103 genes having strong algal affiliations
arising from multiple phototrophic donors [172]. Such notable examples illustrate
the transfer of algal genes to animal recipients.

KEGG orthology-based annotation of the predicted proteome of A. digitifera reveals
a plethora of sequences presumed to be of algal origin (Table 7). Like
E. chlorotica, the coral genome encodes the photosystem II (PSII) protein PsbO of
the oxygen-evolving complex of photosynthesis, as well as the PSII light-harvesting
complex protein PsbL that is important in protecting PSII from photo-inactivation
[173]. Encoded also are the photosystem I subunit proteins PsaI and PsaO.
Additionally encoded are the photosystem P840 reaction center cytochrome c551
(PscC) protein and the photosynthetic reaction center M subunit protein, the
light-harvesting complex 1 alpha chain (PufA), the light-harvesting complex II
chlorophyll a/b binding protein 6 (LHCB6), the cyanobacterial phycobilisome
proteins AcpF and AcpG, the phycocyanin-associated antenna protein CpcD, the
phycocyanobilin lyase protein CpcF and the phycoerythrin-associated linker protein
CpeS. Like E. chlorotica, the coral genome encodes chlorophyll synthase (ChlG), a
chlorophyll transporter protein PucC, a light-independent nitrogenase-like
protochlorophyllide reductase enzyme that is sensitive to oxygen [174] and a red
chlorophyll catabolite reductase essential to the detoxification of photodynamic
chlorophyll catabolites arising from plant/algal senescence [175]. Three
Table 6 Calcification and Ca2+-signalling proteins in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.06452; v1.06451; v1.24424; v1.16923 | K07300 | Ca2+:H+ antiporter
v1.01669 [+ 9 other sequence copies] | K01537 | Ca2+-transporting ATPase
v1.22367; v1.22366; v1.22365 | K05850 | Ca2+ transporting ATPase, plasma membrane
v1.19074 | K05853 | Ca2+ transporting ATPase, sarcoplasmic/endoplasmic reticulum
v1.22416; v1.22417; v1.15682; v1.00750 | K14757 | Calbindin D28
v1.24568 [+ 9 other sequence copies] | K01672 | Carbonic anhydrase
v1.09241 | K08272 | Calcium binding protein 39
v1.02323 [+ 39 other sequence copies] | K13448 | Calcium-binding protein CML
v1.05162 [+ 21 other sequence copies] | K13412 | Calcium-dependent protein kinase
v1.09352 | K07359 | Calcium/calmodulin-dependent protein kinase kinase
v1.06475; v1.07555; v1.00945; v1.00159; v1.21122 | K08794 | Calcium/calmodulin-dependent protein kinase I
v1.06475; v1.01061; v1.21150; v1.22443 | K04515 | Calcium/calmodulin-dependent protein kinase II
v1.00159 | K05869 | Calcium/calmodulin-dependent protein kinase IV
v1.21927; v1.01218; v1.22226; v1.06623; v1.13703 | K06103 | Calcium/calmodulin-dependent serine protein kinase
v1.13460 | K08284 | Calcium channel MID1
v1.20738; v1.01401 | K12841 | Calcium homeostasis endoplasmic reticulum protein
v1.22794 [+ 11 other sequence copies] | K04612 | Calcium-sensing receptor
v1.10079 [+ 17 other sequence copies] | K02183 | Calmodulin
v1.10994 | K14734 | S100 calcium binding protein G
v1.02488 [+ 14 other sequence copies] | K05849 | Solute carrier family 8 (sodium/calcium exchanger)
v1.23153 [+ 9 other sequence copies] | K13749 | Solute carrier family 24 (sodium/potassium/calcium exchanger)
v1.14863 | K12304 | Soluble calcium-activated nucleotidase 1
v1.18656 [+ 13 other sequence copies] | K04858 | Voltage-dependent calcium channel alpha-2/delta-1
v1.13222 | K04860 | Voltage-dependent calcium channel alpha-2/delta-3
v1.08078 [+ 9 other sequence copies] | K05315 | Voltage-dependent calcium channel alpha 1, invertebrate
v1.03896 [+ 6 other sequence copies] | K05316 | Voltage-dependent calcium channel alpha-2/delta, invertebrate
v1.04798 | K05317 | Voltage-dependent calcium channel beta, invertebrate
v1.22788 | K04863 | Voltage-dependent calcium channel beta-2
v1.09999 | K04872 | Voltage-dependent calcium channel gamma-7
v1.02505 | K04873 | Voltage-dependent calcium channel gamma-8
v1.03648 [+ 6 other sequence copies] | K04850 | Voltage-dependent calcium channel L type alpha-1C
v1.03648; v1.17267 | K04851 | Voltage-dependent calcium channel L type alpha-1D
v1.03648; v1.13219; v1.21895 | K04857 | Voltage-dependent calcium channel L type alpha-1S
v1.06313; v1.01656; v1.23096 | K04344 | Voltage-dependent calcium channel P/Q type alpha-1A
v1.08078 [+ 10 other sequence copies] | K04849 | Voltage-dependent calcium channel N type alpha-1B
v1.07968 | K04852 | Voltage-dependent calcium channel R type alpha-1E
v1.01364; v1.13467; v1.08705 | K04854 | Voltage-dependent calcium channel T type alpha-1G
v1.15414; v1.14241; v1.09595 | K04855 | Voltage-dependent calcium channel T type alpha-1H



chlorosome proteins of the photosynthetic antenna complex of green sulphur
bacteria, a bacteriochlorophyll methyltransferase involved in BChl c biosynthesis
[176] and the retinylidene bacteriorhodopsin of phototrophic Archaea are also
encoded in the coral genome. Present are genes encoding subunit 6 of the cytochrome
b6f complex that links PSII and PSI via the plastoquinone pool, together with
chloroplast ferredoxin-like NapH and NapG proteins and their 2Fe-2S cluster
protein. The coral genome, however, encodes sequences for NAD+-ferredoxin
Table 7 Plant-derived proteins in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.14452 | K09843 | (+)-Abscisic acid 8'-hydroxylase
v1.18868 | K14496 | Abscisic acid receptor PYR/PYL family (PYL)
v1.21983; v1.05890 | K03342 | p-Aminobenzoate synthetase / 4-amino-4-deoxychorismate lyase (PabBC)
v1.15436 | K02822 | Ascorbate-specific IIB component, PTS system (PTS-Ula-EIIB)
v1.11187; v1.13966 | K00423 | L-Ascorbate oxidase
v1.20081; v1.22465 | K13604 | Bacteriochlorophyll C20 methyltransferase (BchU)
v1.07723 | K04641 | Bacteriorhodopsin (BoP)
v1.21858 | K04040 | Chlorophyll synthase (ChlG)
v1.01742 | K08945 | Chlorosome envelope protein A (CsmA)
v1.04797; v1.14208 | K08946 | Chlorosome envelope protein B (CsmB)
v1.18698 | K08948 | Chlorosome envelope protein D (CsmD)
v1.18637 | K02642 | Cytochrome b6f complex subunit 6 (PetL)
v1.21101; v1.14192; v1.14548 | K01735 | 3-Dehydroquinate synthase (AroB)
v1.05796 | K10210 | 4,4'-Diaponeurosporene oxidase (carotenoid biosynthesis; CrtP)
v1.11730 | K04755 | Ferredoxin, 2Fe-2S (FdX)
v1.19154; v1.00014 | K00532 | Ferredoxin hydrogenase
v1.00014 | K00534 | Ferredoxin hydrogenase small subunit
v1.17698; v1.06031; v1.16647 | K02574 | Ferredoxin-type protein (NapH)
v1.23058 | K02573 | Ferredoxin-type protein (NapG)
v1.08414 | K08926 | Light-harvesting complex 1 alpha chain (PufA)
v1.21458 | K08917 | Light-harvesting complex II chlorophyll a/b binding protein 6 (LHCB6)
v1.03743 | K08226 | MFS transporter, BCD family, chlorophyll transporter (PucC)
v1.13030; v1.08678 | K13413 | Mitogen-activated protein kinase kinase 4/5, plant (MKK4_5P)
v1.02429; v1.10744; v1.03340 | K08929 | Photosynthetic reaction center M subunit (PufM)
v1.03631 | K02696 | Photosystem I subunit VIII (PsaI)
v1.11432 | K14332 | Photosystem I subunit (PsaO)
v1.17422 | K02713 | Photosystem II protein (PsbL)
v1.18303 | K02716 | Photosystem II oxygen-evolving enhancer protein 1 (PsbO)
v1.12300; v1.21136 | K08942 | Photosystem P840 reaction center cytochrome c551 (PscC)
v1.00280 | K02097 | Phycobilisome core component 9 (AcpF)
v1.10967 | K02290 | Phycobilisome rod-core linker protein (AcpG)
v1.02166 | K02287 | Phycocyanin-associated, rod protein (CpcD)
v1.07305; v1.19572; v1.01248 | K02289 | Phycocyanobilin lyase beta subunit (CpcF)
v1.10441 | K05382 | Phycoerythrin-associated linker protein (CpeS)
v1.13406 | K10027 | Phytoene dehydrogenase (desaturase; CrtI)
v1.18809; v1.06199 | K02291 | Phytoene synthase (CrtB)
v1.02037; v1.14064; v1.21095 | K09060 | Plant G-box-binding factor (GBF)
v1.10035 | K00218 | Protochlorophyllide reductase [NifEN-like; Por]
v1.21846 | K05358 | Quinate dehydrogenase (QuiA)
v1.03127 | K13545 | Red chlorophyll catabolite reductase (ACD2)
v1.05899 | K00891 | Shikimate kinase (AroK, AroL)
v1.21101; v1.14192; v1.05899 | K13829 | Shikimate kinase / 3-dehydroquinate synthase (AroKB)
v1.12938 | K08500 | Syntaxin of plants (SYP6)
v1.06575 | K08506 | Syntaxin of plants (SYP7)
v1.04929 | K09834 | Tocopherol cyclase (VTE1, SXD1)
v1.01022 | K05928 | Tocopherol O-methyltransferase
v1.05457 | K09838 | Zeaxanthin epoxidase (ZEP, ABA1)



reductase (HcaD; not tabulated), rather than the required NADP+-ferredoxin
reductase of photosynthesis. Annotation of the A. digitifera genome revealed genes
unexpectedly encoding ferredoxin hydrogenase [EC:1.12.7.2] and its small subunit
protein (Table 7), involved in light-dependent production of molecular hydrogen,
having its [Fe-Fe]-cluster coupled to the photosynthetic transport chain via a
charge-transfer complex with ferredoxin (see [177]).

Like N. vectensis and the dinoflagellate Oxyrrhis marina, the genome of
A. digitifera encodes an O-methyltransferase which is immediately downstream of
EVS, but the two genes are not fused. Using a ZoophyteBase BlastP search, the
O-methyltransferase showed little sequence homology with the corresponding protein
of A. variabilis (e-value of 6.972E-2 and bit score of 34.27), whereas the EVS
protein shared 87% absolute sequence identity with the A. variabilis EVS protein.
What role, if any, these two genes play in mycosporine-like amino acid (MAA)
biosynthesis in A. digitifera has yet to be determined, although it has been
suggested from the transcriptome of Acropora microphthalma that MAA biosynthesis
proceeds from a branch point at 3-dehydroquinate of the shikimic acid pathway as a
shared metabolic adaptation between the coral host and its symbiotic zooxanthellae
[40]. The 3-dehydroquinate synthase enzyme of the shikimic acid pathway, thought to
be a key intermediate in an alternative MAA biosynthetic pathway in A. variabilis
[178], is instead encoded by the fused aroKB gene of A. digitifera (Table 7).
Additional shikimate proteins of the predicted proteome, although not limited to
phototrophs, are shikimate kinase (AroK), quinate dehydrogenase (QuiA) and the
conjoined p-aminobenzoate synthase and 4-amino-4-deoxychorismate lyase (PabBC)
enzyme necessary for folate biosynthesis [179]. Other plant-related gene homologues
include the phytohormone abscisic acid receptor protein (PYL) and its cytochrome
P450 monooxygenase abscisic acid 8'-hydroxylase, L-ascorbate oxidase and PTS system
degrading enzymes, the unique SYP6 and SYP7 syntaxins of plant vesicular transport,
tocopherol cyclase and a tocopherol O-methyltransferase enzyme that converts
γ-tocopherol to α-tocopherol. Essential for carotene biosynthesis are phytoene
synthase (CrtB) and phytoene dehydrogenase (CrtI) enzymes. Significantly, encoded
within the coral genome is zeaxanthin epoxidase that is essential for abscisic acid
biosynthesis and is a key enzyme in the xanthophyll cycle of plants and algae to
impart oxidative stress tolerance.
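
The e-value and bit score quoted above for the O-methyltransferase comparison are linked by the standard BLAST relation E = N x 2^(-S'), where S' is the bit score and N is the effective search space (roughly the product of the effective query and database lengths). The sketch below back-calculates the N implied by the reported pair of values. The article does not state the ZoophyteBase search space, so N here is a derived, illustrative quantity, and the function names are hypothetical.

```python
def evalue_from_bits(bit_score: float, search_space: float) -> float:
    """Expected number of chance hits: E = N * 2**(-S')."""
    return search_space * 2.0 ** (-bit_score)

def implied_search_space(evalue: float, bit_score: float) -> float:
    """Invert the relation to recover N from a reported (E, S') pair."""
    return evalue * 2.0 ** bit_score

# Values reported in the text for the O-methyltransferase BlastP hit.
E_REPORTED = 6.972e-2
BITS_REPORTED = 34.27

# Effective search space implied by those two numbers (derived, not reported).
n_eff = implied_search_space(E_REPORTED, BITS_REPORTED)
print(f"implied effective search space: {n_eff:.3e}")

# Hypothetical comparison: the e-value a 60-bit hit would have in the same search.
print(f"e-value at 60 bits: {evalue_from_bits(60.0, n_eff):.3e}")
```

Because E falls by half for every additional bit, a score near 34 bits in a search space of this size sits close to what chance alone would produce, which is consistent with the authors' reading of "little sequence homology".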



Given that viruses often mediate gene transfer processes, it is intriguing that
certain bacteriophages of marine Synechococcus and Prochlorococcus cyanobacteria
are reported to carry genes encoding the photosynthesis D1 (psbA) and D2 (psbD)
proteins, a high-light inducible protein (HLIP) [180,181] and the photosynthetic
electron transport plastocyanin (petE) and ferredoxin (petF) proteins thought to
enhance the photosynthetic fitness of their host [182-184]. Accordingly, it has
been suggested that the transfer of psbA by viruses associated with Symbiodinium
could lessen the severity of thermal impairment to PSII and the response of corals
to thermal bleaching [185]. It is yet unknown if phages or dinoflagellate-infecting
viruses [186], particularly those of Symbiodinium [187], may affect gene transfer
leading to complementary (or "shared") metabolic adaptations of symbiosis
[119,188].

Proteins of nitrogen metabolism 

It is well accepted that intracellular Symbiodinium spp. provide reduced carbon for
coral heterotrophic metabolism by photosynthetic carbon fixation. Because of this
metabolic relationship, light is a critical feature in the bioenergetics of coral
symbiosis [189]. The algal photosynthate translocated to corals, however, is
deficient in nitrogen at levels necessary to sustain autotrophic growth. While
corals can assimilate fixed nitrogen from surrounding seawater [190], "recycled"
nitrogen within the symbiosis may account for as much as 90% of the photosynthetic
nitrogen demand [191]. It would not be surprising, then, that light has a strong
influence on the uptake and retention of ammonium by symbiotic corals.
Consequently, corals excrete excess ammonium in darkness [192], and in light
excretion is induced by treatment with the photosynthetic electron transport
inhibitor 3-(3,4-dichlorophenyl)-1,1-dimethylurea (DCMU) [193]. Since ammonia is
the product of nitrogen fixation, these observations suggest that the coral
holobiont may fix nitrogen in the dark, or when photosynthesis is repressed, during
which time coral tissues are hypoxic [194] and nitrogenase activity is not
inactivated by molecular oxygen [195].

Tropical coral reefs are typically surrounded by low-nutrient oceanic waters of low
productivity but, paradoxically, the waters of coral reefs often have elevated
levels of inorganic nitrogen [196,197] attributed to high rates of nitrogen
fixation. While nitrogen fixation by diazotrophic epiphytes of the coral reef
substrata and sediments [197,198] and diazotrophic bacterioplankton of the coral
reef lagoon [199] provides substantial quantities of fixed nitrogen for
assimilation by the coral reef, mass-balance estimates show this input to be less
than the community's annual nitrogen demand [200]. Endolithic nitrogen-fixing
bacteria are abundant in the skeleton of living corals, where they benefit from
organic carbon excreted by overlying coral tissues to provide a ready source of
energy for dinitrogen reduction [201]. Additionally, intracellular nitrogen-fixing
cyanobacteria are reported to coexist with dinoflagellate symbionts in the tissues
of Montastraea cavernosa and to functionally express nitrogenase activity [202].
Corals also harbour a diverse assemblage of heterotrophic microorganisms in their
skeleton, tissues and lipid-rich mucus (reviewed in [203]), and these communities
include large populations of diazotrophic bacteria [204,205] and archaea [206].
Apart from nitrogen fixation, the coral microbiota contributes to other
nitrogen-cycling processes, such as nitrification, ammonification and
denitrification [207,208]. We were surprised to find several nitrogen fixation and
cycling proteins encoded in the genome of A. digitifera (Table 8), notably a
nitrogen fixation NifU-like protein, the Nif-specific regulatory protein (NifA),
the regulatory NAD(+)-dinitrogen-reductase ADP-D-ribosyltransferase protein, a
nitrifying ammonia monooxygenase enzyme and nitrate reductase, which are usually
expressed only by prokaryotic microorganisms.

The presence of genes encoding proteins involved in 
nitrogen fixation raises speculation that corals may con- 
tribute directly to, or perhaps co-regulate, certain pro- 
cesses that catalyse the reduction of dinitrogen (N2) to 
ammonia (NH 3 ) by the enzyme nitrogenase reductase 
(NifH). The functional NifH enzyme is a binary protein 
composed of a molybdenum-iron (MoFe) protein (NifB/ 
NifDK), or its NifEN homologue, fused with a FeMo- 
cofactor (FeMoco) protein [209]. While genes encoding 
NifB, NifDK (or NifEN) and their FeMo-cofactor do not 
appear in the genome of A. digitifera, a gene encoding the 
NifEN-like protein protochlorophyllide oxidoreductase 
(POR) is present (Table 8). POR has all three subunits 
with high similarity to the assembled MoFe nitrogenase 
[210], but this homologue is unlikely to be effective in ni- 
trogen reduction [211,212] since its activity is light 
dependent [213] when tissues are highly oxic [193]. The 
NifU protein encoded in the coral genome preassembles 
the metallocatalytic Fe-S clusters for maturation of nitro- 
genase [214], but its assemblage without NifS, a cysteine 
desulfurase needed for [Fe-S] cluster assembly [215], 
would be incomplete, and its pre-nitrogenase receptor is 
also missing. Yet, the coral does have the nifj gene that en- 
codes pyruvate:flavodoxin oxidoreductase required for 
electron transport in nitrogenase reduction [216]. The 
regulatory NifA protein encoded in the coral genome
might activate, on stimulation by the integration host fac-
tor (INF), transcription of nitrogen fixation (nif) operons
by RNA polymerase [217], and both of these proteins are
encoded in the coral genome. Additional to this transcrip- 
tional control, post-translational nitrogenase activity is 
controlled by reversible ADP-ribosylation of a specific 
arginine residue in the nitrogenase complex [218].
NAD(+)-dinitrogen-reductase ADP-D-ribosyltransferase (DraT)
inactivates the nitrogenase complex while ADP-ribosylgly- 
cohydrolase (DraG) removes the ADP-ribose moiety to re- 
store nitrogenase activity, and both of these enzymes are 
encoded in the coral genome. Given that the genes encoding
essential constituent proteins of nitrogenase assembly appear
incomplete, corals are unlikely to fix nitrogen per se, but the
prospect that co-opted elements of the coral genome regulate
nitrogen fixation by its diazotrophic consortia is worthy of
exploration [219].
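
The presence/absence reasoning in this paragraph can be illustrated with a minimal Python sketch. This is not part of the published annotation pipeline; the names REQUIRED, ANNOTATED and completeness are hypothetical, and the component lists contain only those proteins that the text above explicitly reports as encoded or missing.

```python
# Illustrative sketch only: names and groupings are assumptions for this example.

# Nitrogenase-related components whose presence or absence is stated in the text.
REQUIRED = ["NifB", "NifDK", "NifS", "NifU", "NifJ"]

# Components the text reports as encoded in the A. digitifera genome (Table 8).
ANNOTATED = {"NifU", "NifJ", "NifA", "INF", "DraT", "DraG", "POR"}


def completeness(required, annotated):
    """Split `required` into (present, missing), preserving the input order."""
    present = [name for name in required if name in annotated]
    missing = [name for name in required if name not in annotated]
    return present, missing


present, missing = completeness(REQUIRED, ANNOTATED)
print("present:", present)
print("missing:", missing)
```

Under these assumptions the regulatory and accessory components are recovered while the structural components are reported as missing, which mirrors the conclusion that nitrogenase assembly appears incomplete.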

Nitrosifying/nitrifying bacteria and archaea express the
enzyme ammonia monooxygenase, which converts fixed
ammonia to nitrite (via hydroxylamine), and the enzyme
nitrite (oxido)reductase, which completes the oxidation of
nitrite to nitrate; both of these enzymes are encoded
in the genome of A. digitifera (Table 8). The ammonia
monooxygenase subunit A (amoA) of archaeal consorts 
has been described in nine species of coral from four 
reef locations [220], but the presence of amoA in the 
coral genome, together with encoded ammonium trans- 
port proteins, was not anticipated. Another protein of 
prokaryotic origin encoded in the coral genome is ni- 
trate reductase (periplasmic, assimilatory and respira- 
tory), the latter being required for anaerobic respiration 
by bacteria [221], and unlike the nitrate reductase family 
of sulphite oxidase enzymes in eukaryotes, the nitrate re- 
ductases of prokaryotes (K00363) belong to the DMSO 
reductase family of enzymes. Also encoded in the coral 
genome are a nitrite transporter (NirC) and a formate- 
dependent nitrite reductase (NrfA) required for nitrite 
ammonification [222]. In addition to nitrite reduction, 
NrfA reduces nitric oxide, hydroxylamine, nitrous oxide 
and sulphite, the last providing a metabolic link between 
nitrogen and sulphur cycling in coral metabolism. Other 
enzymes of nitrogen metabolism encoded in the coral 
genome are the carbamoyl-phosphate synthase family of 
enzymes [223] that catalyses the ATP-dependent syn- 
thesis of carbamoyl phosphate used for the production 
of urea (ornithine cycle) to provide a ready store of 
fixed-N in the urea-nitrogen metabolism of corals [224]. 
Another nitrogen source comes from glutamate dehydrogenase (GDH) that
reversibly converts glutamate to α-ketoglutarate with liberation of
ammonia, and as expected [225], this enzyme is encoded in the coral
genome, together with the prokaryotic nitrogen regulatory protein PII
of glutamine synthase, which in bacteria is activated in response to
nitrogen availability. Encoded also is histidine ammonia-lyase
(histidase) that liberates ammonia (and urocanic acid) from cytosolic
stores of histidine. It is now accepted that uric acid deposits
accumulated by symbiotic algae provide a significant store of nitrogen
for the coral holobiont [226], so it is noteworthy that the coral genome
encodes urate oxidase (uricase) to catalyse uric acid oxidation to
allantoin from which urea and ureidoglycolate are produced in a reaction
catalysed by allantoicase (allantoate amidinohydrolase), both known
isoforms of which are present in the coral genome. Encoded in the coral
genome is also urease to catalyse the hydrolysis of urea, presumably
excreted by its algal symbionts, with the release of carbon dioxide and
ammonia to meet the nitrogen demand of the coral holobiont during
periods of low nitrogen availability. Similarly, xanthine dehydrogenase
(xanthine:NAD+-oxidoreductase) acts by oxidation on a variety of purines,
including hypoxanthine, to yield urate for the recycling of nitrogen in
coral nutrition. Many of the aforementioned proteins of nitrogen
metabolism, including Nif proteins, have been detected in the proteome
of an endosymbiont-enriched fraction of the coral S. pistillata [39].

Table 8 Proteins of nitrogen metabolism in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.23444; v1.09133; v1.23443 | K05521 | ADP-ribosylglycohydrolase (DraG)
v1.09202 | K10944 | Ammonia monooxygenase subunit A
v1.03645 [+ 8 other sequence copies] | K03320 | Ammonium transporter, Amt family
v1.12268; v1.12269 | K06580 | Ammonium transporter Rh
v1.02406 | K01954 | Carbamoyl-phosphate synthase (CPS)
v1.01524; v1.18283; v1.18284 | K01948 | Carbamoyl-phosphate synthase (CPS, ammonia)
v1.01615 | K04016 | Formate-dependent nitrite reductase (NrfA)
v1.16277; v1.23483; v1.13667; v1.22675 | K00261 | Glutamate dehydrogenase (NAD(P)+)
v1.17166; v1.11089 | K01745 | Histidine ammonia-lyase
v1.22825; v1.08034; v1.08520 | K05123 | Integration host cell factor (INF) subunit beta
v1.11343 | K05951 | NAD+-dinitrogen-reductase ADP-D-ribosyltransferase (DraT)
v1.00547 | K02584 | Nif-specific regulatory protein (NifA)
v1.18869 | K00371 | Nitrate reductase 1, beta subunit
v1.06763 | K08346 | Nitrate reductase 2, beta subunit
v1.14858; v1.00685; v1.23148 | K05916 | Nitric oxide dioxygenase
v1.16954 [+ 5 other sequence copies] | K02448 | Nitric oxide reductase NorD protein
v1.06115 | K02164 | Nitric oxide reductase NorE protein
v1.17629 [+ 12 other sequence copies] | K04748 | Nitric oxide reductase NorQ protein
v1.24077 [+ 4 other sequence copies] | K13125 | Nitric oxide synthase-interacting protein
v1.21801; v1.05719; v1.23577; v1.19464 | K13253 | Nitric-oxide synthase, invertebrate
v1.05980 | K00363 | Nitrite reductase (NAD(P)H) small subunit
v1.00101 | K02598 | Nitrite transporter NirC
v1.02355; v1.18772 | K04488 | Nitrogen fixation protein NifU
v1.17812 | K02589 | Nitrogen regulatory protein PII 1
v1.09150 | K02570 | Periplasmic nitrate reductase NapD
v1.01560 | K02571 | Periplasmic nitrate reductase NapE
v1.10035 | K00218 | Protochlorophyllide reductase [NifEN-like]
v1.08939 | K03737 | Pyruvate-flavodoxin reductase (NifJ)
v1.17373 | K00365 | Urate oxidase
v1.13217 | K01427 | Urease
v1.16409 [+ 5 other sequence copies] | K03187 | Urease accessory protein
v1.13217 | K01429 | Urease subunit beta
v1.13217 | K14048 | Urease subunit gamma/beta
v1.12211 [+ 4 other sequence copies] | K00106 | Xanthine dehydrogenase/oxidase
v1.12212 | K13481 | Xanthine dehydrogenase small subunit

(Excluding amino acid and pyrimidine/purine nucleotide synthesis or metabolism).

Notwithstanding consideration of the rapid diffusion 
rate of nitric oxide (NO) or its apparent short biological 
half-life [227], there is debate about the provenance of 
endogenously produced NO in signalling the bleaching 
of corals in response to environmental stress. Elevated 
nitric oxide synthase (NOS) activity and NO production 
in algal symbionts has been attributed to the thermal 
stress response of corals [228,229], whereas the host is 
ascribed to be the major source of NO during exposure 
to elevated temperature [230,231]. While our annotation
may not resolve this dispute, we show (Table 8) that
nitric oxide reductase proteins (NorD, NorE and NorQ)
and an invertebrate nitric oxide synthase (NOS) are encoded
in the genome of A. digitifera, together with a nitric oxide
synthase-interacting protein (NOSIP) that in higher animals
regulates neuronal NOS activity [232]. Nitric oxide is an
intermediate of nitrite reduction catalysed by nitrite re- 
ductase (NIR), which by further reduction produces am- 
monia. The coral genome also encodes nitric oxide 
dioxygenase (NOD) that converts nitric oxide to nitrate. 
Accordingly, enhanced expression of NIR (NO reduc- 
tion) or NOD (NO oxidation) could ameliorate the NO- 
signalling response of coral bleaching presumed acti- 
vated by environmental stress. 

DNA repair 

Cellular DNA is prone to damage caused by the products of normal
metabolism and by exogenous agents. Damage to DNA from metabolic
processes includes the oxidation of nucleobases and strand interruptions
by the production of reactive oxygen species (ROS), alkylation of
nucleotide bases, hydrolysis of bases causing deamination, depurination
and depyrimidination, and the mismatch of base pairs from errors in DNA
replication. Damage effected by external agents includes exposure to UV
light causing pyrimidine dimerization and free radical-induced damage,
exposure to ionising radiation causing DNA strand breaks, thermal
disruption causing hydrolytic depurination and single-strand breaks, and
xenobiotic contamination causing DNA adduct formation, nucleobase
oxidation and DNA crosslinking. Most of these lesions effect structural
changes to DNA that alter or prevent replication and gene transcription
at the site of DNA damage. Thus, recognition and repair of DNA
abnormalities are vital processes essential to maintain the genetic
integrity of the coral genome. Since there are multiple pathways causing
DNA damage at diverse molecular sites, there are likewise diverse and
overlapping processes available to repair cellular DNA damage. Of the
many nuclear repair processes, photoreactivation (photolyase), base
excision repair and nucleotide excision repair are the main elements for
the repair of cellular DNA damage.

Exposure to sunlight is an absolute requirement for 
phototrophic symbiosis, but excessive exposure of corals 
to solar ultraviolet radiation can inflict direct damage to 
DNA by pyrimidine dimerization and 6-4 photoadduct 
formation and cause indirect damage by the production 
of ROS to initiate free-radical damage. While there have 
been abundant studies on the sensitivity of corals to 
solar ultraviolet radiation, only a few have examined the 
effects of solar UV to cause DNA damage. Photoreactiva- 
tion has been shown to be an important repair pathway for 
reversing UV-activated DNA damage in adult coral [233] 
and coral planulae [234]. UV damage to DNA was first 
demonstrated by the detection of unrepaired cyclobutane 
pyrimidine dimers (CPDs) in the host tissues and algal 
symbionts of the coral Porites porites, in which CPDs had
increased in a UV dose-dependent manner [235], whereas 
CPDs and 6-4 pyrimidine-pyrimidone photoadducts in the 
coral Montipora verrucosa holobiont were correlated in- 
versely with levels of coral "sunscreen" protection [236]. 
The effects of solar UV radiation causing DNA lesions in 
coral have been determined by use of the comet assay 
[237], and UV-induced DNA damage and repair has been 
examined in the symbiotic anemone Aiptasia pallida 
[238]. The comet assay showed also that DNA lesions in 
coral planulae had increased on acquiring algal symbionts, 
presumably from greater ROS production resulting as a 
by-product of photosynthesis [239]. Iron-induced oxidative 
stress was found likewise to enhance DNA damage in the 
coral Pocillopora damicornis as determined by the oc- 
currence of DNA apurinic/apyrimidinic sites caused by 
hydrolytic lesions [240]. Significantly, DNA damage in the 
host and algal symbionts of the coral Montastraea faveo- 
lata was found to occur simultaneously during thermal 
"bleaching" stress, and DNA damage is further enhanced 
on exposure to greater irradiances of solar radiation [241]. 
Nevertheless, despite the serious risk of unrepaired DNA 
damage to coral survival, the DNA repair processes of 
corals to mitigate the detrimental effects of environmental 
stress have not been adequately characterised at the tran-
scriptome level of expression [29,242].

Our annotation of the sequenced genome of A. digitifera 
has revealed genes encoding a large repertoire of DNA 
repairing enzymes and their adaptor proteins (Table 9). 
Given strong evidence for DNA photoreactivation in 
corals having been reported [233,234], it was surprising to 
find only one gene in single copy that encodes a sole 
photolyase enzyme for reversing pyrimidine dimer and
6-4 photoadduct formation. Notably, we found genes encod-
ing 6 members of the ERCC family of nucleotide excision 
repair enzymes, together with the UV excision repair pro-
tein RAD23, for the repair of UV-induced DNA damage.

Table 9 DNA repair proteins in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.02961; v1.13402 | K03575 | A/G-specific adenine glycosylase (MutY)
v1.11766 | K03919 | Alkylated DNA repair protein
v1.04821 | K10765 | Alkylated DNA repair protein alkB homologue 1
v1.02479 | K10766 | Alkylated DNA repair protein alkB homologue 4
v1.20302 | K10767 | Alkylated DNA repair protein alkB homologue 5
v1.24450 | K10768 | Alkylated DNA repair protein alkB homologue 6
v1.02766; v1.09413 | K10770 | Alkylated DNA repair protein alkB homologue 8
v1.01590 [+ 4 other sequence copies] | K10884 | ATP-dependent DNA helicase 2 subunit 1
v1.18810; v1.03166; v1.08449 | K10885 | ATP-dependent DNA helicase 2 subunit 2
v1.08013 | K03722 | ATP-dependent DNA helicase DinG
v1.03542 | K14635 | ATP-dependent DNA helicase MPH1
v1.06737 [+ 5 other sequence copies] | K15255 | ATP-dependent DNA helicase PIF1
v1.17360; v1.21235 | K10899 | ATP-dependent DNA helicase Q1
v1.01081 [+ 8 other sequence copies] | K10730 | ATP-dependent DNA helicase Q4
v1.16859 | K10902 | ATP-dependent DNA helicase Q5
v1.11661 [+ 19 other sequence copies] | K03654 | ATP-dependent DNA helicase RecQ
v1.20397 | K03656 | ATP-dependent DNA helicase Rep
v1.18049; v1.07731; v1.05830 | K10905 | ATR interacting protein
v1.01679 | K01669 | Deoxyribodipyrimidine photo-lyase
v1.03410; v1.12968; v1.00865; v1.16876 | K10887 | DNA cross-link repair 1C protein
v1.07474; v1.07473; v1.01809 | K10610 | DNA damage-binding protein 1
v1.13116; v1.03378; v1.16328 | K10140 | DNA damage-binding protein 2
v1.17099 [+ 5 other sequence copies] | K11885 | DNA damage-inducible protein 1
v1.05469 | K06663 | DNA damage checkpoint protein
v1.02859; v1.14719; v1.21030; v1.10920 | K04452 | DNA damage-inducible transcript 3
v1.02191 | K10844 | DNA excision repair protein ERCC-2
v1.19108 [+ 5 other sequence copies] | K10843 | DNA excision repair protein ERCC-3
v1.22267 [+ 4 other sequence copies] | K10848 | DNA excision repair protein ERCC-4
v1.15137 [+ 5 other sequence copies] | K10846 | DNA excision repair protein ERCC-5
v1.18550; v1.02606; v1.14935; v1.08831 | K10841 | DNA excision repair protein ERCC-6
v1.20045; v1.01844; v1.11724; v1.03203 | K10570 | DNA excision repair protein ERCC-8
v1.15430; v1.03058 | K03658 | DNA helicase IV
v1.00228 [+ 4 other sequence copies] | K11665 | DNA helicase INO80
v1.00136; v1.0678; v1.21529 | K10776 | DNA ligase 3
v1.23293; v1.19418; v1.23430; v1.15721 | K10777 | DNA ligase 4
v1.19248 | K07458 | DNA mismatch endonuclease, patch repair protein
v1.19011 | K08739 | DNA mismatch repair protein MLH3
v1.11513; v1.11449 | K08735 | DNA mismatch repair protein MSH2
v1.14781 | K08736 | DNA mismatch repair protein MSH3
v1.05696; v1.22444; v1.19162 | K08740 | DNA mismatch repair protein MSH4
v1.04904 | K08741 | DNA mismatch repair protein MSH5
v1.15360; v1.19426; v1.08585 | K08737 | DNA mismatch repair protein MSH6
v1.02429 [+ 8 other sequence copies] | K03572 | DNA mismatch repair protein MutL
v1.03990 | K03555 | DNA mismatch repair protein MutS
v1.14015 | K07456 | DNA mismatch repair protein MutS2
v1.08443 | K10864 | DNA mismatch repair protein PMS1
v1.15229 | K10858 | DNA mismatch repair protein PMS2
v1.08658; v1.14152; v1.01681 | K15082 | DNA repair protein RAD7
v1.16407 [+ 27 other sequence copies] | K10866 | DNA repair protein RAD50
v1.22193 | K04482 | DNA repair protein RAD51
v1.02646; v1.22076 | K10958 | DNA repair protein RAD57
v1.15671 [+ 4 other sequence copies] | K04483 | DNA repair protein RadA
v1.16193; v1.19033 | K04485 | DNA repair protein RadA/Sms
v1.16079; v1.07685 | K04484 | DNA repair protein RadB
v1.21363; v1.22360; v1.02900 | K03584 | DNA repair protein RecO (recombination protein O)
v1.18390 | K03515 | DNA repair protein REV1
v1.04705 | K10991 | DNA repair protein Swi5/Sae3
v1.13920; v1.03800; v1.16133 | K10803 | DNA repair protein XRCC1
v1.15052 | K10879 | DNA repair protein XRCC2
v1.09315 [+ 4 other sequence copies] | K10886 | DNA repair protein XRCC4
v1.02733; v1.24592 | K10868 | DNA repair protein XRS2
v1.14551; v1.23176 | K10873 | DNA repair and recombination protein RAD52
v1.20503 [+ 4 other sequence copies] | K10875 | DNA repair and recombination protein RAD54
v1.23173; v1.16050 | K10877 | DNA repair and recombination protein RAD54B
v1.07227; v1.08907; v1.09439; v1.02644 | K10847 | DNA repair protein complementing XP-A cells
v1.11534 [+ 5 other sequence copies] | K10865 | Double-strand break repair protein MRE11
v1.07939 | K03660 | N-glycosylase/DNA lyase
v1.16163 | K03652 | 3-Methyladenine DNA glycosylase
v1.07231 | K10726 | Replicative DNA helicase Mcm
v1.05482 | K04499 | RuvB-like protein 1 (pontin 52)
v1.19813 | K11338 | RuvB-like protein 2
v1.06890 | K15080 | Single-strand annealing weakened protein 1
v1.17193; v1.14087 | K03111 | Single-strand DNA-binding protein
v1.15575 | K10800 | Single-strand monofunctional uracil DNA glycosylase
v1.07134 | K10992 | Swi5-dependent recombination DNA repair protein 1
v1.13860 | K03649 | TDG/mug DNA glycosylase family protein
v1.14423; v1.14399; v1.05070 | K03648 | Uracil-DNA glycosylase
v1.23838 | K10791 | Three prime repair exonuclease 2
v1.19522 | K10839 | UV excision repair protein RAD23

More abundant are the DNA mismatch repair enzymes from the MLH, MSH, Mut
and PMS protein families and related glycosylase/lyase proteins for
repairing erroneous insertion, deletion and mis-incorporation of bases
that arise during DNA replication and recombination. There is
additionally a specific gene that encodes a 3'-endonuclease protein that
has a preference to correct mispaired nucleotide sequences. Abundant
also are other members of the RAD-family of DNA repair proteins,
including 28 sequence copies of a gene encoding the RAD50 protein for
DNA double-strand break repair that, together with members of the MRE,
Rec, REV, Swi5/Sae3, XRCC and XRS families of recombination and
polymerase proteins, have complementary roles in DNA repair. Apparent
also in the genome are the DNA helicase proteins, including RuvB-like
proteins, which are primarily involved in DNA replication and
transcription, but assist also in the repair of DNA damage by separating
double strands at affected sites of DNA damage to facilitate repair. Of
the multiple families of ATP-dependent DNA helicase proteins encoded in
the coral genome, RecQ and helicase Q predominate. Encoded in the coral
genome are 5 homologues of the DNA repair alkB proteins that reverse
damage to DNA from alkylation caused by chemical agents by removing
methyl groups from 1-methyladenine and 3-methylcytosine products in
single-strand DNA. Annotated also are genes encoding DNA ligase 3 for
repairing single-strand breaks, DNA ligase 4 to repair double-strand
breaks, and a DNA cross-link repair 1C protein with single-strand
specific endonuclease activity that may serve in a proofreading function
for DNA polymerase. Taken together, expressing this arsenal of DNA
protection may provide corals with limited ability to transcribe
gene-encoded adaptation to a changing global environment.
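
The copy numbers quoted above follow directly from the "Gene sequence" column of Table 9, where additional copies are abbreviated as "[+ N other sequence copies]". A minimal Python sketch of that tally is given below; it is an illustration rather than the method used for the annotation, and the function name count_copies and the small rows dictionary are hypothetical.

```python
import re


def count_copies(gene_cell: str) -> int:
    """Count sequence copies described by one 'Gene sequence' table cell."""
    # Explicitly listed identifiers have the form v1.NNNNN.
    listed = len(re.findall(r"v1\.\d+", gene_cell))
    # Further copies are abbreviated as '[+ N other sequence copies]'.
    extra = re.search(r"\[\+\s*(\d+) other sequence copies\]", gene_cell)
    return listed + (int(extra.group(1)) if extra else 0)


# A few cells transcribed from Table 9 (selection is arbitrary).
rows = {
    "RAD50": "v1.16407 [+ 27 other sequence copies]",
    "RecQ": "v1.11661 [+ 19 other sequence copies]",
    "Photolyase": "v1.01679",
    "DNA ligase 4": "v1.23293; v1.19418; v1.23430; v1.15721",
}

for name, cell in rows.items():
    print(name, count_copies(cell))
```

With these cells the tally gives 28 copies for RAD50 and a single copy for the photolyase, in agreement with the counts stated in the text.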

Stress response proteins 

Annotation of the A. digitifera genome reveals a wide assortment of
thermal shock proteins, molecular chaperones and other stress response
elements that are given in Table 10, excluding antioxidant and
redox-protective proteins, which are described in the next section. Heat
shock proteins 70 kDa, 90 kDa, 110 kDa, HspQ and HspX (the last two
proteins being homologues of the bacterial heat shock factor sigma32 and
α-crystallin, respectively) are encoded in the coral genome, together
with several HSP gene transcription factors. HSPs play a role in various
cellular functions such as protein folding, intracellular protein
trafficking and resistance to protein denaturation. HSP expression is
usually increased on exposure to elevated temperatures and other
conditions of biotic and abiotic stress that include infection,
inflammation, metabolic hyperactivity, exposure to environmental
toxicants, ultraviolet light exposure, starvation, hypoxia and
desiccation [243]. HSPs and chaperones are transcriptionally regulated
and are induced by heat shock transcription factors [244], of which
there are several encoded in the coral genome. Since HSPs are found in
virtually all living organisms, it is not surprising that cnidarian hsp
transcription and protein expression (HSP60, HSP70 and HSP90) have been
profiled as a stress determinant [245-250] and early warning indicator
of coral bleaching [251-254]. The coral genome reveals also a cold shock
protein encoded by the cspA gene family, but profiling its expression
with other stress response proteins activated by sub-optimum cold
temperatures [255] has not been reported. Additionally, the coral genome
encodes a homologue of the universal stress protein A (UspA), a member
of an ancient and conserved group of stress-response proteins [256,257],
which have been studied mostly in bacteria [258] but have been described
also in several plants [259] and animals, including members of the
Cnidaria [260]. Usp transcripts have been quantified in the thermal
stress response of the coral Montastraea faveolata [261] and its
aposymbiotic embryos [262]. Another gene product of potential interest
is a homologue of the oxidative-stress responsive protein 1 (OXSR1) that
belongs to the Ser/Thr kinase family of proteins, as do other
mitogen-stress activated protein kinases (MAPKs), that regulate
downstream kinases in response to environmental stress [263] by
interacting with the Hsp70 subfamily of proteins [264]. Another
significant response protein encoded in the coral genome (Table 10) is a
homologue of the stress-induced phosphoprotein 1 (30 domain sequence
alignments), known also as the Hsp70-Hsp90 organising protein (HOP)
belonging to the stress inducible (STI1) family of proteins, which is a
principal adaptor protein that mediates the functional cooperation of
molecular chaperones Hsp70 and Hsp90 [265,266]. It is yet to be
determined if Hop1 transcription may serve as a primary indicator of
environmental stress in corals.

Molecular chaperones are a diverse family of proteins 
expressed by both prokaryotic and eukaryotic organisms 
that serve to maintain correct protein folding in a 3- 
dimensional functional state, assist in multiprotein com- 
plex assembly and protect proteins from irreversible 
aggregation at synthesis and during conditions of cellu- 
lar stress [267]. Additionally, heat shock proteins and 
their co-chaperones may regulate cell death pathways by 
inhibition of apoptosis [268]. The coral genome encodes 
a large number of DnaJ subfamily (J-domain) chaperones 
(Hsp40) that with co-chaperone GrpE (Table 10) regulate
the ATPase activity of Hsp70 (DnaK in bacteria)
to enable correct protein folding [269]. The coral gen- 
ome encodes homologues of the molecular chaperones 
HscA (specialised Hsp70), the redox-regulated chape- 
rone Hsp33, HtpG (high temperature protein G), mem- 
bers of the calnexin/calreticulin chaperone system of the 
endoplasmic reticulum, a mitochondrial chaperone BCS1 
protein necessary for the assembly of the respiratory chain 
complex III and a specific chaperone of trimethylamine N-oxide
reductase (TorA). The coral genome also encodes
hypoxia-inducible factors (HIFs) that moderate the dele- 
terious effects of hypoxia on cellular metabolism (reviewed 
in [270]). In the HIF signalling cascade, the alpha subunits 
of HIF are hydroxylated at conserved proline residues by 
HIF prolyl-hydroxylases allowing their recognition for 
proteasomal degradation, which occurs during normoxic
conditions but is repressed by oxygen depletion. Hypoxia- 
stabilised HIF1 upregulates the expression of enzymes 
principally of the oxygen-independent glycolysis pathway, 
and in higher animals promotes vascularisation, whereas 
the mammalian HIF2 paralogue regulates erythropoietin 
control of hepatic erythrocyte production in response to 
hypoxic stress [271]. The roles of HIF1 and HIF2 homo- 
logues in corals have been established, with HIF1 regula- 
tion of glycolysis critical to metabolic function during the 
dark diurnal anoxic state of coral respiration [193,272].

Table 10 Stress response proteins in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.04616; v1.06277 | K03694 | ATP-dependent Clp protease subunit ClpA
v1.04617; v1.23486; v1.23484; v1.10207 | K03695 | ATP-dependent Clp protease subunit ClpB
v1.13464 | K03697 | ATP-dependent Clp protease subunit ClpE
v1.06903; v1.11461 | K06891 | ATP-dependent Clp protease adaptor protein
v1.12577; v1.09531; v1.17184 | K03544 | ATP-dependent Clp protease subunit ClpX
v1.09407 | K08054 | Calnexin (protein-folding chaperone)
v1.16781 | K08057 | Calreticulin (Ca2+-binding chaperone)
v1.04005 | K10098 | Calreticulin 3 (Ca2+-binding chaperone)
v1.02702 [+ 5 other sequence copies] | K03704 | Cold shock protein (beta-ribbon, CspA family)
v1.01907; v1.18998 | K07213 | Copper chaperone
v1.23457; v1.01713; v1.19228 | K04569 | Copper chaperone for superoxide dismutase
v1.08719; v1.19128 | K09502 | DnaJ homologue subfamily A member 1
v1.08719; v1.18432 | K09503 | DnaJ homologue subfamily A member 2
v1.16210; v1.22054 | K09504 | DnaJ homologue subfamily A member 3
v1.19128 | K09505 | DnaJ homologue subfamily A member 4
v1.04818 [+ 6 other sequence copies] | K09506 | DnaJ homologue subfamily A member 5
v1.02841; v1.02842 | K09507 | DnaJ homologue subfamily B member 1
v1.00368; v1.13308; v1.16977; v1.03340 | K09508 | DnaJ homologue subfamily B member 2
v1.11537; v1.09205; v1.08628; v1.02840 | K09511 | DnaJ homologue subfamily B member 5
v1.24549 [+ 9 other sequence copies] | K09512 | DnaJ homologue subfamily B member 6
v1.01573 | K09513 | DnaJ homologue subfamily B member 7
v1.00352; v1.09196; v1.06645 | K09514 | DnaJ homologue subfamily B member 8
v1.18536 [+ 4 other sequence copies] | K09515 | DnaJ homologue subfamily B member 9
v1.14710 | K09517 | DnaJ homologue subfamily B member 11
v1.14959 | K09518 | DnaJ homologue subfamily B member 12
v1.09205 | K09519 | DnaJ homologue subfamily B member 13
v1.16242 | K09520 | DnaJ homologue subfamily B member 14
v1.20109; v1.03468 | K09521 | DnaJ homologue subfamily C member 1
v1.07111 [+ 5 other sequence copies] | K09522 | DnaJ homologue subfamily C member 2
v1.21077 [+ 13 other sequence copies] | K09523 | DnaJ homologue subfamily C member 3
v1.07739; v1.22910 | K09524 | DnaJ homologue subfamily C member 4
v1.01239 [+ 13 other sequence copies] | K09525 | DnaJ homologue subfamily C member 5
v1.17629 [+ 29 other sequence copies] | K09527 | DnaJ homologue subfamily C member 7
v1.18619; v1.08300; v1.23789 | K09528 | DnaJ homologue subfamily C member 8
v1.13575; v1.04213 | K09529 | DnaJ homologue subfamily C member 9
v1.05956; v1.05955; v1.21265; v1.21205 | K09530 | DnaJ homologue subfamily C member 10
v1.13525; v1.04120 | K09531 | DnaJ homologue subfamily C member 11
v1.09496 [+ 4 other sequence copies] | K09533 | DnaJ homologue subfamily C member 13
v1.24546 | K09534 | DnaJ homologue subfamily C member 14
v1.05866 | K09536 | DnaJ homologue subfamily C member 16
v1.16151; v1.08307; v1.14980 | K09537 | DnaJ homologue subfamily C member 17
v1.16309 | K09539 | DnaJ homologue subfamily C member 19
v1.05241; v1.22999; v1.17372 | K14258 | Facilitated trehalose transporter (anhydrobiosis)
v1.12967; v1.19789 | K14590 | FtsJ methyltransferase [heat shock protein]
v1.02247 | K09414 | Heat shock transcription factor 1
v1.24112 | K09416 | Heat shock transcription factor 3
v1.05839 | K09419 | Heat shock transcription factor, other eukaryote
v1.12890 [+ 10 other sequence copies] | K03283 | Heat shock 70 kDa protein 1/8
v1.07996 | K09489 | Heat shock 70 kDa protein 4
v1.02854; v1.07452; v1.01623 | K09490 | Heat shock 70 kDa protein 5
v1.14149; v1.14150 | K09487 | Heat shock protein 90 kDa beta
v1.07995; v1.07996; v1.16399; v1.11283 | K09485 | Heat shock protein 110 kDa
v1.08943; v1.05577 | K11940 | Heat shock protein HspQ
v1.00537; v1.00043 | K03799 | Heat shock protein HtpX
v1.01623 | K04046 | Hypothetical chaperone protein
v1.16216 | K08268 | Hypoxia-inducible factor 1 alpha
v1.08869; v1.15120 | K09097 | Hypoxia-inducible factor 1 beta
v1.22724 | K09095 | Hypoxia-inducible factor 2 alpha
v1.23698 [+ 16 other sequence copies] | K06711 | Hypoxia-inducible factor prolyl 4-hydroxylase
v1.16737; v1.22345 | K09486 | Hypoxia up-regulated 1 (heat shock protein 70 family)
v1.10188 | K08900 | Mitochondrial chaperone BCS1
v1.17197; v1.04394 | K04445 | Mitogen-stress activated protein kinases
v1.16301; v1.21224; v1.19344 | K04043 | Molecular chaperone DnaK
v1.09682; v1.16748; v1.07471; v1.13624 | K03687 | Molecular chaperone GrpE
v1.01621; v1.04945; v1.15919 | K04044 | Molecular chaperone HscA
v1.18210 | K04083 | Molecular chaperone Hsp33
v1.17478; v1.16977; v1.10289; v1.19907 | K04079 | Molecular chaperone HtpG
v1.08895; v1.18099 | K11416 | Mono-ADP-ribosyltransferase sirtuin 6
v1.02024 | K11411 | NAD-dependent deacetylase sirtuin 1
v1.04813 | K11412 | NAD-dependent deacetylase sirtuin 2
v1.22049; v1.22211; v1.02221 | K11413 | NAD-dependent deacetylase sirtuin 3
v1.11849; v1.02221 | K11414 | NAD-dependent deacetylase sirtuin 4
v1.05495 | K11415 | NAD-dependent deacetylase sirtuin 5
v1.04868 | K11417 | NAD-dependent deacetylase sirtuin 7
v1.15070 [+ 4 other sequence copies] | K08835 | Oxidative-stress responsive protein 1 (OXSR1)
v1.04503 | K11875 | Proteasome assembly chaperone 1
v1.01531 | K11878 | Proteasome assembly chaperone 4
[illegible in source] | K11879 | Proteasome chaperone 1
v1.18611 | K11880 | Proteasome chaperone 2
v1.00599 [+ 29 other sequence copies] | K09553 | Stress-induced-phosphoprotein 1 (HOP1)
v1.08830 | K13057 | Trehalose synthase (anhydrobiosis)
v1.22042 | K03533 | TorA specific chaperone
v1.16986 [+ 7 other sequence copies] | K06149 | Universal stress protein A

Heat shock proteins that repair unfolded or misfolded
proteins have a complementary function to the ubiquitin-proteasome
system (ubiquitins not tabulated) that selects
damaged protein for degradation [273], such that HSP
chaperones and the proteasome act jointly to preserve
cellular proteostasis [274,275]. Thus, several proteasome
chaperones and assembly chaperones are encoded in the
A. digitifera genome (Table 10). While proteasome chaperones
serve to target aberrant proteins for ubiquitination,
the proteasome assembly chaperones facilitate 20S assembly for
biogenesis of the multiunit 26S proteasome that is activated
in response to stress [276,277], possibly by FtsJ (aka
RrmJ), a well-conserved heat shock protein having novel
ribosomal methyltransferase activity that targets methylation
of 26S rRNA under heat shock control [278,279]. The
HspQ protein encoded in the coral genome, although 
studied almost exclusively in bacteria, is known to stimu- 
late degradation of denatured proteins caused by hyper- 
thermal stress, particularly DnaA that initiates DNA 
replication in prokaryotes [280]. Specifically, HspQ (heat 
shock factor sigma32) regulates the expression of Clp 
ATPase-dependent protease family enzymes [281,282], of 
which ClpA, ClpB, ClpE, the protease adaptor protein
ClpS [283] and the unfoldase ClpX protein [284] are 
encoded in the coral genome (Table 10). HspX is a small 
16 kDa α-crystallin chaperone (Acr) protein belonging to
the Hsp20 family of proteins [285] that suppresses thermal 
denaturation and aggregation of proteins [285]. Signifi- 
cantly, Acr proteins are known to bind with carbonic 
anhydrase [286] and may have importance in moderating 
stress-induced loss of calcium deposition. Thus, HspX/ 
Acr expression may account for differences in the thermal 
sensitivity of corals to calcification that varies among gen- 
era [287]. In a different context, HspX is attracting consid- 
erable attention for its potential to elicit long-term 
protective immunity against human Mycobacterium tuber- 
culosis infection by chaperoning a host-protective antigen 
[288] that by extension, but yet untested, may likewise re- 
press virulence in the initiation and progression of micro- 
bial coral disease [289,290]. 

The coral genome encodes complete membership of the 
human sirtuin (SIRT1-7) family of NAD(+)-dependent 
protein deacetylases and ADP-ribosyltransferases. Mam- 
malian SIRT1 (a homologue of yeast Sir2) is an important 
regulator of metabolism, cell differentiation, stress re- 
sponse transcription and pathways of cellular senescence 
(reviewed in [291]). SIRT proteins regulate chromatin 
function through deacetylation of histones that promote 
subsequent alterations in the methylation of histones and 
DNA to affect, via deactivation of nuclear transcription 
factors and co-regulators, epigenetic control of nuclear 
transcription. As an NAD+-dependent enzyme, SIRT1 can
regulate gene expression in response to cellular NAD+/NADH
redox status, providing a metabolic template for
epigenetic transcriptome reprogramming [292,293]. In
the human genome repertoire, SIRT1 modulates cellular
responses to hypoxia by deacetylation of HIF1α [294]
and inhibits nitric oxide synthesis by suppression of
the nuclear factor-kappaB (NF-κB) signalling pathway
[295], SIRT2 promotes oxidative stress resistance by
deacetylation of forkhead box O (FOXO) proteins [296],
SIRT3 decreases ROS production in adipocytes [297],
SIRT4 regulates fatty acid metabolism and stress-response
elements of mitochondrial gene expression
[298], SIRT5 is a protein lysine desuccinylase and
demalonylase of unknown function [299], SIRT6 activates
base-excision repair [300] and SIRT7 inhibits
apoptosis induced by oxidative stress by deacetylation of
p53 [301,302]. The significance of coral SIRT proteins,
by analogy, to exert stress tolerance is yet to be
examined.

Metallochaperones are an important class of enzymes 
that transport co-factor metal ions to specific proteins 
[303]. The copper chaperone protein ATX1 (human 
ATOX1) delivers cytosolic copper to Cu-ATPase proteins 
and serves as a metal homeostasis factor to prevent 
Fenton-type production of highly reactive hydroxyl radi- 
cals. ATX1, which is strongly induced by molecular oxy- 
gen, functions additionally as an antioxidant to protect 
cells against the toxicity of both the superoxide anion and 
hydrogen peroxide [304]. Encoded also is a specific copper 
chaperone essential to the activation of Cu/Zn superoxide 
dismutase [305,306] that is enhanced by photooxidative 
stress in scleractinian corals [307], although reported to
be less pronounced in the host than in symbiotic algae 
[308]. In addition to high light exposure, reef-building 
corals of shallow reef flats are occasionally exposed to the 
atmosphere for periods that can last several hours during 
extreme low tides. Hence, species that are adapted to 
withstand acute desiccation (anhydrobiosis) have a better 
chance of surviving such conditions. The disaccharide tre- 
halose is an osmolyte that in some plants and animals al- 
lows them to survive prolonged periods of desiccation 
[309]. The hydrated sugar has high water retention that 
forms a gel phase when cells dehydrate, which on rehydra- 
tion allows normal cellular activity to resume without 
damage that would otherwise follow a dehydration/rehy- 
dration cycle. Furthermore, trehalose is highly effective in 
protecting enzymes in their native state from inactivation 
from thermal denaturation [310]. Given that A. digitifera 
is endemic on shallow reef flats prone to exposure at low 
tides [311], it is not surprising that the coral genome en- 
codes trehalose synthase and a facilitated trehalose trans- 
porter for protection against dehydration. 

Antioxidant and redox-protective proteins 

Oxygen is vital for life, but it can also cause damage to 
cells, particularly at elevated levels. In coral symbiosis, 
the photosynthetic endosymbionts of corals typically 
produce more oxygen than the holobiont is able to con- 
sume by respiration, so that coral tissues are hyperoxic 
with tissue pO2 levels often exceeding 250% of air saturation
during daylight illumination [193]. Furthermore,
because algal symbionts reside within the endodermal 
cells of their host, coral tissues must be transparent to 
facilitate the penetration of downwelling light required 
for photosynthesis by their algal consorts. In clear shallow 
waters this entails concurrent exposure of vulnerable
molecular sites of both partners to damaging wavelengths
of ultraviolet radiation. The synergistic effects of tissue 
hyperoxia and UV exposure can cause oxidative damage 
to the symbiosis via the photochemical production of 
cytotoxic oxygen species [312] that are produced also dur- 
ing normal mitochondrial function [313]. Consequently, 
protective proteins (antioxidant enzymes) are expressed to 
maintain the fine balance between oxygen metabolism and 
the production of potentially toxic reactive oxygen species 
(ROS). If this balance is not maintained by regulation of 
oxidative and reductive processes (redox regulation), oxi- 
dative stress occurs by the generation of excess ROS, caus- 
ing damage to DNA, proteins, and lipids. Corals elaborate 
a variety of molecular defences that include the production
of UV-protective sunscreens (MAAs), antioxidants,
antioxidant enzymes, chaperones and heat shock proteins, 
which are often inducible under conditions of enhanced 
oxidative stress [307], including conditions that elicit coral 
bleaching [314,315]. An excellent review on the formation 
of ROS and the role of antioxidants and antioxidant en- 
zymes in the field of redox biology is given by Halliwell 
[316]. 

Annotation of the A. digitifera genome reveals se- 
quences encoding two isoforms of the antioxidant enzyme 
superoxide dismutase (SOD) from both the Cu/Zn and 
Fe/Mn families of SOD (Table 11). These metalloprotein 
enzymes catalyse the dismutation of superoxide to yield 
molecular oxygen and hydrogen peroxide, the latter being 
less harmful than superoxide. Superoxide can oxidize pro- 
teins, denature enzymes, oxidize lipids and fragment 
DNA. By removing superoxide, SOD protects also against 
the production of reactive peroxynitrite formed by the 
combination of superoxide and nitric oxide, which is a 
precursor reactant for production of the supra-reactive hy- 
droxyl radical. Hydrogen peroxide per se is a mild oxidant, 
but it readily oxidises free cellular ferrous iron to ferric 
iron with production of hydroxyl radicals via the Fenton 
reaction. Accordingly, both the removal of hydrogen per- 
oxide and the expression of proteins, such as transferrin, 
(bacterio)ferritins and metallothioneins, that bind reactive 
(transition) metal ions is important to protect cellular 
components from acute oxidative damage. Oddly, only a 
metallothionein expression activator was found encoded 
in the coral genome without finding a sequence to activate 
transcription of the actual metallothionein protein gene. 
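
For reference, the reactions named in this paragraph can be written out as balanced equations. This is a sketch using standard textbook stoichiometry, not material taken from the annotation itself; the typesetting assumes the amsmath `align*` environment.

```latex
\begin{align*}
2\,\mathrm{O_2^{\bullet-}} + 2\,\mathrm{H^+} &\longrightarrow \mathrm{H_2O_2} + \mathrm{O_2}
  && \text{(dismutation of superoxide by SOD)}\\
\mathrm{O_2^{\bullet-}} + \mathrm{NO^{\bullet}} &\longrightarrow \mathrm{ONOO^-}
  && \text{(peroxynitrite formation)}\\
\mathrm{Fe^{2+}} + \mathrm{H_2O_2} &\longrightarrow \mathrm{Fe^{3+}} + \mathrm{OH^-} + {}^{\bullet}\mathrm{OH}
  && \text{(Fenton reaction)}
\end{align*}
```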

As expected from the foregoing, the genome of A. 
digitifera encodes the antioxidant enzyme catalase (CAT) 
that is highly efficient in decomposing hydrogen peroxide 
to yield molecular oxygen and water. Two isoforms of 
CAT are encoded at multiple sites. One is a peroxisomal 
eukaryotic CAT enzyme that targets the removal of hydro- 
gen peroxide formed as a by-product of oxidase enzymes, 
and the other is a related catalase domain-containing protein
presumed also to decompose hydrogen peroxide.
Glutathione peroxidase (GPx) reduces both hydrogen peroxide
and lipid hydroperoxides, the latter of which are
formed by radical-induced lipid autoxidation. Photo- 
trophic organisms, including higher plants, utilise ascor- 
bate peroxidase (APx) as a primary catalyst for the 
reduction of hydrogen peroxide and lipid hydroperoxides. 
However, unlike the freshwater cnidarian H. viridis [164], 
there is no evidence for transfer of APx-encoding genes to 
A. digitifera. The antioxidant enzymes SOD, CAT, GPx 
and APx are well characterised in the algal and animal 
partners of coral symbiosis (reviewed in [317]). Addition- 
ally, the coral genome has sequences encoding alkyl hy- 
droperoxide reductase, hydroperoxide lyase, phospholipid- 
hydroperoxide glutathione peroxidase, thiol peroxidase 
and multiple isoforms of peroxiredoxin, all of which func- 
tion in the detoxification of organo-hydroperoxides that 
are produced as a by-product of aerobic metabolism. Add- 
itionally, sulfiredoxin (Table 11) repairs peroxiredoxins 
when these enzymes are inhibited by over-oxidation [318]. 
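
The peroxide-removing reactions discussed above can be summarised as follows. This is a sketch of the generally accepted stoichiometry; GSH and GSSG (reduced and oxidised glutathione) and ROOH (a lipid hydroperoxide) are abbreviations introduced here for illustration and are not used in the tables.

```latex
\begin{align*}
2\,\mathrm{H_2O_2} &\longrightarrow 2\,\mathrm{H_2O} + \mathrm{O_2}
  && \text{(CAT)}\\
\mathrm{H_2O_2} + 2\,\mathrm{GSH} &\longrightarrow \mathrm{GSSG} + 2\,\mathrm{H_2O}
  && \text{(GPx, hydrogen peroxide)}\\
\mathrm{ROOH} + 2\,\mathrm{GSH} &\longrightarrow \mathrm{ROH} + \mathrm{GSSG} + \mathrm{H_2O}
  && \text{(GPx, lipid hydroperoxide)}
\end{align*}
```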

Thioredoxins and glutaredoxins have important second- 
ary roles in regulating multiple pathways in many 
biological processes, including redox signalling of apop- 
totic pathways, which have been attributed to processes 
involved in coral bleaching [56]. Other enzymes that regu- 
late cellular thiol-disulfide homeostasis in this coral are 
monothiol glutaredoxin and protein-disulfide reductase. 
The coral genome encodes the ubiquitous thioredoxin sys- 
tem of antioxidant proteins (Table 11) that act as electron 
donors to peroxidases and ribonucleotide reductase (the 
latter not tabulated). By cysteine thiol-disulfide exchange, 
thioredoxins function as a protein thiol-disulfide oxi- 
doreductase [319]. In the thioredoxin system, thioredoxins 
are maintained in their reduced state by NADPH-dependent,
flavoenzyme thioredoxin reductase [320]. Peptide-methionine
(R)-S-oxide reductase can additionally rescue
thioredoxin from oxidative inactivation by disulfide reduc- 
tion. Related glutaredoxins share many of the functions of 
thioredoxins but are reduced directly by glutathione, ra- 
ther than by a specific reducing enzyme, while in turn 
glutathione is kept in its native state by NADPH: glutathi- 
one reductase. 
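
The recycling steps described in this paragraph can be sketched as the following overall reactions. The notation is an assumption made for illustration: Trx and Grx denote thioredoxin and glutaredoxin, with S2 and (SH)2 marking the disulfide and dithiol forms, and GSH/GSSG denote reduced and oxidised glutathione.

```latex
\begin{align*}
\mathrm{Trx\text{-}S_2} + \mathrm{NADPH} + \mathrm{H^+} &\longrightarrow \mathrm{Trx\text{-}(SH)_2} + \mathrm{NADP^+}
  && \text{(thioredoxin reductase)}\\
\mathrm{Grx\text{-}S_2} + 2\,\mathrm{GSH} &\longrightarrow \mathrm{Grx\text{-}(SH)_2} + \mathrm{GSSG}
  && \text{(direct reduction by glutathione)}\\
\mathrm{GSSG} + \mathrm{NADPH} + \mathrm{H^+} &\longrightarrow 2\,\mathrm{GSH} + \mathrm{NADP^+}
  && \text{(glutathione reductase)}
\end{align*}
```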

In recent years there has been a particular focus on the 
role of ROS in coral bleaching, fuelled by dire predictions
of future catastrophic episodes caused by environmental 
change affected by global warming [321]. Early predic- 
tions of coral bleaching were based principally on 
physical environmental parameters, rather than on the 
determination of the physiological state of coral 
populations to such conditions. While gene expression 
markers are being developed to monitor sub-bleaching 
levels of stress in situ (e.g., [261]), Kenkel et al. [322] 
opined that the current challenge for implementing 
expression-based methods lies in identifying coral 
genes demonstrating the most pronounced and consistent

Table 11 Antioxidant and redox-protective proteins in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.10918 | K04756 | Alkyl hydroperoxide reductase subunit D
v1.11551 | K03387 | Alkyl hydroperoxide reductase subunit F
v1.07812 | K03594 | Bacterioferritin
v1.21362 [+ 4 other sequence copies] | K00429 | Catalase (bacterial)
v1.17525 [+ 4 other sequence copies] | K03781 | Catalase (peroxisomal)
v1.23457; v1.01713; v1.19228 | K04569 | Copper chaperone for superoxide dismutase
v1.20153; v1.20154 | K10528 | Hydroperoxide lyase
v1.19687; v1.19688; v1.18796; v1.18795 | K00522 | Ferritin heavy chain
v1.06441 | K03674 | Glutaredoxin 1
v1.19449 | K03675 | Glutaredoxin 2
v1.14929 [+ 5 other sequence copies] | K03676 | Glutaredoxin 3
v1.13285; v1.03722; v1.03688; v1.10496 | K00432 | Glutathione peroxidase
v1.13174; v1.13775; v1.05473 | K00383 | Glutathione reductase (NADPH)
v1.14344; v1.19399; v1.01421 | K01920 | Glutathione synthase
v1.02173 | K09238 | Metallothionein expression activator
v1.09719; v1.16134; v1.18608 | K07390 | Monothiol glutaredoxin
v1.14890; v1.17685 | K07305 | Peptide-methionine (R)-S-oxide reductase
v1.14909 | K00435 | Peroxiredoxin
v1.14106 | K13279 | Peroxiredoxin 1
v1.08691 | K11187 | Peroxiredoxin 5, atypical 2-Cys peroxiredoxin
v1.01410 | K11188 | Peroxiredoxin 6, 1-Cys peroxiredoxin
v1.03688 | K05361 | Phospholipid-hydroperoxide glutathione peroxidase
v1.05148 | K05905 | Protein-disulfide reductase
v1.02922; v1.22772; v1.24164 | K05360 | Protein-disulfide reductase (glutathione)
v1.06810 | K12260 | Sulfiredoxin
v1.01713 [+ 4 other sequence copies] | K04565 | Superoxide dismutase, Cu/Zn family
v1.09974; v1.20324 | K04564 | Superoxide dismutase, Fe/Mn family
v1.02378 | K11065 | Thiol peroxidase, atypical 2-Cys peroxiredoxin
v1.22324 [+ 7 other sequence copies] | K03671 | Thioredoxin 1
v1.05148; v1.03230; v1.20699 | K03672 | Thioredoxin 2
v1.17881 [+ 5 other sequence copies] | K13984 | Thioredoxin domain-containing protein 5
v1.04532; v1.24501 | K09585 | Thioredoxin domain-containing protein 10
v1.11551; v1.19049 | K00384 | Thioredoxin reductase (NADPH)
v1.10930 | K14736 | Transferrin


stress response, preferably with a large dynamic range to 
enable reliable quantification. To this end, we offer in 
Table 11 the annotation of novel redox-related genes for 
examination as potential candidate biomarkers to monitor 
the physiological response of A. digitifera to environmen- 
tal stress. 

Proteins of cellular apoptosis 

Apoptosis is the signalling of programmed cell death 
(PCD) that occurs in multicellular organisms in response
to cellular injury. A key feature of apoptosis is the activation
of endogenous endonucleases causing nuclear
fragmentation, chromatin condensation and chromo- 
somal DNA fragmentation, which typically presents in 
affected cells by the morphological appearance of plasma 
membrane blebbing and cell shrinkage. Caspases and re- 
lated family member proteases are described as "execu- 
tioners" of apoptosis that on post-translational activation 
degrade the regulatory proteins that prevent DNA degradation.
Fragmentation of nuclear DNA is one of the
hallmarks of apoptotic cell death that occurs by PCD
stimuli in a wide variety of proliferating cells. NF-κB is a
protein complex that controls the transcription of DNA
and can induce the expression of nitric oxide synthase
(NOS) to produce NO, a well-known promoter
of the pro-apoptotic transcription factor p53, the cell-cycle
gatekeeper of the caspase cascade. In contrast to
necrosis, apoptosis, which is the outcome of PCD, mediates
the fragmentation of damaged cells, which are
removed by phagocytosis or degraded in phagolysosomes
to spare surviving cells from the uncontrolled
release of cytotoxic agents. Proteins of the caspase-mediated
apoptotic cascade are regarded as products of
constitutive housekeeping genes that are necessary to
maintain healthy multicellular function [323]. In the progression
of cnidarian bleaching, apoptotic pathways are
activated [322-325], but not all corals that suffer
bleaching are destined to die [326,327]. Coral survival
has been attributed to having a high level of apoptotic
protection at the onset of coral bleaching [328] and during
post-bleaching recovery [329] by specific activation
of anti-apoptotic Bcl-2 proteins in surviving cells [330].

Cnidarians have a complex apoptotic protein network
that has exceptional ancestral complexity and is comparable
to that of higher vertebrates [331,332]. Cnidarian
metamorphosis is tightly coupled with caspase-dependent
apoptosis [333] and subsequent host-symbiont selection
by post-phagocytic winnowing of Symbiodinium genotypes
during the establishment of coral-dinoflagellate mutualism
[334]. As expected, the coral genome of A.
digitifera encodes multiple isoforms of the caspase
family of apoptotic effectors (Table 12).
Included in this signalling pathway are the pro- and anti-apoptotic
Bax/Bcl regulators and Bcl-2 athanogene (DNA-binding)
activators of apoptosis. Notable in our annotation
dataset are multiple genes that encode the protein
domains of the apoptotic protease-activating factor
(Apaf) that triggers assembly of the apoptosome leading
to caspase activation [335]. Additional to this arsenal of
cell cycle regulators are the death-associated protein 6
(DAXX), a Fas-binding adaptor of c-Jun N-terminal kinase
(JNK) activation [336], death-associated protein kinase
(DAPK), a calcium/calmodulin-regulated Ser/Thr kinase
mediator of cell death [337], and the programmed cell death
6-interacting protein (PDCD6IP), which binds to PDCD-6 for
execution of apoptosis via the caspase-3 pathway [338].
PDCD6IP activation of apoptosis is an enigma since
PDCD-6 is not encoded in the coral genome, nor is
caspase-3. Other cell cycle regulators are the p53-binding
and p53-associated parkin-like proteins, the activating
TP53 regulating kinase protein and the TP53 apoptosis
effector of TP53 gene expression.

Our genome annotation reveals 73 sequence matches for
expressing the Apaf protein domain that, in conjunction
with a high copy number for expressing caspase-8 (28 protein
sequence matches), may enhance coral survival during
embryogenesis by suppressing receptor-interacting protein
kinase (45 sequence matches) during early development
[339]. The most conserved function of the CASP2/RIPK1
adaptor (45 sequence matches) encoded in the coral genome
is its essential regulation of apoptosis [340]. We find a
wide repertoire of genes that additionally encode proteins
that mediate apoptosis (Table 12). Amongst these are the
calpain Ca2+-sensing family of proteins that initiate the signalling
of apoptotic pathways [341]. There are 79 matches
to sequences that encode the tumor necrosis Fas superfamily
member 6 (TNFRSF6) receptor, which coupled with the
death domain (FADD) protein is a cell signalling mediator
for recruitment of caspase-8 that activates the apoptotic
cysteine protease cascade. Coincident in the genome are
67 sequences encoding the leucine-rich repeat and death
domain-containing (LRDD) adaptor that, by interacting
with other p53-inducible death domain-containing (PIDD)
proteins such as FADD, induces the caspase-2 pathway of
apoptosis in response to DNA damage [342]. Elements of
the NF-κB signalling pathway of cnidarians are highly conserved
traits [343], which include the caspase cascade and
the pro-apoptotic and anti-apoptotic Bcl-2 family of proteins
[344]. The coral genome of A. digitifera encodes the
pleiotropic nuclear factor NF-κB p105 subunit, and astonishingly
there are 212 sequence matches to the NF-κB
inhibitor-like protein 2 domain with fewer matches to the
NF-κB inhibitor-like protein 1 and NF-κB family inhibitors
alpha, delta and epsilon. Evident in our genome annotation
is the tumor necrosis factor-alpha induced protein
3 (TNFAIP3), a cytokine produced by activated (inflammatory)
macrophages. Although TNF cytokines are a
major extrinsic mediator of cellular apoptotic pathways,
the precise function of the superfamily members of
TNF ligands and receptors (Table 12) remains elusive in
coral symbiology.
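
The sequence-match totals quoted in this paragraph follow from the "Gene sequence" column of Table 12, where each entry lists one or more identifiers and, for large families, a "[+ N other sequence copies]" suffix. The sketch below shows one way to recover those totals; the function name `count_sequence_matches` and the dictionary of cells are hypothetical illustrations and are not part of the published annotation pipeline or of ZoophyteBase.

```python
import re

# Hypothetical helper (an assumption for illustration): count the gene
# models named in one "Gene sequence" cell of an annotation table.
def count_sequence_matches(gene_field: str) -> int:
    # Identifiers listed explicitly, e.g. "v1.05086; v1.20659"
    listed = [g for g in gene_field.split("[")[0].split(";") if g.strip()]
    # Optional suffix such as "[+ 72 other sequence copies]"
    extra = re.search(r"\[\+\s*(\d+)\s+other sequence cop(?:y|ies)\]", gene_field)
    return len(listed) + (int(extra.group(1)) if extra else 0)

# "Gene sequence" cells transcribed from Table 12
table12_cells = {
    "Apoptotic protease-activating factor (Apaf)": "v1.22264 [+ 72 other sequence copies]",
    "Caspase 8": "v1.02756 [+ 27 other sequence copies]",
    "CASP2 and RIPK1 adaptor with death domain": "v1.06297 [+ 44 other sequence copies]",
    "Fas (TNFRSF6)-associated via death domain (FADD)": "v1.18448 [+ 78 other sequence copies]",
    "Leucine-rich repeats and death domain-containing protein": "v1.24288 [+ 66 other sequence copies]",
    "NF-kappa-B inhibitor-like protein 2": "v1.04158 [+ 211 other sequence copies]",
}

for protein, cell in table12_cells.items():
    print(f"{protein}: {count_sequence_matches(cell)}")
```

Applied to these six cells, the helper gives 73, 28, 45, 79, 67 and 212, the same totals cited in the text.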

Microbial symbiosis and pathogenicity 

It is well established that corals associate with a vast
consortium of microbes, including phototrophic symbionts
(Symbiodinium spp.) and other eukaryotic microbionts, 
cyanophytes, heterotrophic bacteria, archaea and viruses 
[345]. Corals harbour diverse and abundant prokaryotic 
communities with distinct populations residing in separ- 
ate habitats of the host skeleton, tissues and surface 
mucus layer (reviewed in [203]). Microbial populations 
are dominated by a few coral-specific taxonomic traits 
[346], but the majority of the population comprises a 
high number of taxonomically diverse, low-abundance 
ribotypes [347] with much of the diversity within the 
coral microbiome belonging to the "rare" biosphere 
[348,349]. The coral microbiome is vital to the nutrition 
and health of the holobiont [350] and contributes

Table 12 Proteins of cellular apoptosis in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.17521; v1.02505; v1.20702; v1.05077 | K02159 | Apoptosis regulator BAX (BCL2-associated)
v1.05086; v1.20659 | K02161 | Apoptosis regulator BCL-2
v1.17522; v1.00181; v1.10817; v1.20703 | K02163 | Apoptosis regulator BCL-W
v1.05147 [+ 6 other sequence copies] | K12875 | Apoptotic chromatin condensation inducer
v1.22264 [+ 72 other sequence copies] | K02084 | Apoptotic protease-activating factor (Apaf)
v1.17326; v1.20305; v1.11586 | K09555 | BCL2-associated athanogene 1
v1.08601 | K09558 | BCL2-associated athanogene 4
v1.02839 | K09559 | BCL2-associated athanogene 5
v1.01518 | K13087 | BCL2-associated transcription factor 1
v1.20278; v1.00172; v1.07858 | K14021 | BCL-2 homologous antagonist/killer
v1.09624 | K02561 | BCL2-related (ovarian) killer protein
v1.17749 | K08573 | Calpain-3
v1.00595; v1.14671; v1.00040 | K08574 | Calpain-5
v1.00040 | K08575 | Calpain-6
v1.19153; v1.17749 | K08576 | Calpain-7
v1.15226 | K04740 | Calpain-12
v1.02951 | K08582 | Calpain-15
v1.11167; v1.06681; v1.20230; v1.01376 | K08585 | Calpain, invertebrate
v1.03127 [+ 6 other sequence copies] | K08583 | Calpain, small subunit 1
v1.17229; v1.00023; v1.09976 | K02186 | Caspase 2
v1.11989 [+ 5 other sequence copies] | K04397 | Caspase 7
v1.02756 [+ 27 other sequence copies] | K04398 | Caspase 8
v1.01818 | K04399 | Caspase 9
v1.00817 [+ 4 other sequence copies] | K04400 | Caspase 10
v1.02005 | K04741 | Caspase 12
v1.00818 [+ 11 other sequence copies] | K04489 | Caspase apoptosis-related cysteine protease
v1.13260 | K07367 | Caspase recruitment domain-containing protein 11
v1.06297 [+ 44 other sequence copies] | K02832 | CASP2 and RIPK1 adaptor with death domain
v1.21531 | K02308 | Death-associated protein 6 (DAXX)
v1.09448; v1.15529; v1.20164 | K08803 | Death-associated protein kinase (DAPK)
v1.23110; v1.14222; v1.03658 | K12366 | Engulfment and motility protein 1 (phagocytosis/apoptosis)
v1.18448 [+ 78 other sequence copies] | K02373 | Fas (TNFRSF6)-associated via death domain (FADD)
v1.24288 [+ 66 other sequence copies] | K10130 | Leucine-rich repeats and death domain-containing protein
v1.20620 | K04734 | NF-kappa-B inhibitor alpha
v1.01706 | K14214 | NF-kappa-B inhibitor delta
v1.10378; v1.10729; v1.05609; v1.05609 | K05872 | NF-kappa-B inhibitor epsilon
v1.17893; v1.22419; v1.00700; v1.08415 | K09256 | NF-kappa-B inhibitor-like protein 1
v1.04158 [+ 211 other sequence copies] | K09257 | NF-kappa-B inhibitor-like protein 2
v1.05320; v1.06979; v1.04467; v1.21371 | K02580 | Nuclear factor NF-kappa-B p105 subunit
v1.20334; v1.22743 | K11970 | p53-Associated parkin-like cytoplasmic protein
v1.14920; v1.11864; v1.15271; v1.11865 | K06643 | p53-Binding protein
v1.04289 | K06708 | Programmed cell death 1 ligand 2
v1.05882 [+ 7 other sequence copies] | K12200 | Programmed cell death 6-interacting protein (PDCD6IP)
v1.10959; v1.04994 | K04727 | Programmed cell death 8 apoptosis-inducing factor
v1.16714 | K06875 | Programmed cell death protein 5 (PDCD-5)
v1.13112 | K03171 | Tnfrsf1a-associated via death domain
v1.24655; v1.12385 | K10136 | TP53 apoptosis effector
v1.09087 | K08851 | TP53 regulating kinase
v1.05030; v1.07044 | K11859 | Tumor necrosis factor, alpha-induced protein 3
v1.22799 | K04389 | Tumor necrosis factor ligand superfamily member 6
v1.05776 | K05470 | Tumor necrosis factor ligand superfamily member 7
v1.13754 | K05472 | Tumor necrosis factor ligand superfamily member 9
v1.21776 [+ 6 other sequence copies] | K04721 | Tumor necrosis factor ligand superfamily member 10
v1.04001 | K05473 | Tumor necrosis factor ligand superfamily member 11
v1.19776 | K05474 | Tumor necrosis factor ligand superfamily member 12
v1.09015; v1.14041 | K03158 | Tumor necrosis factor receptor superfamily member 1A
v1.07010 | K05141 | Tumor necrosis factor receptor superfamily member 1B
v1.19735 | K05142 | Tumor necrosis factor receptor superfamily member 4
v1.13754 | K03160 | Tumor necrosis factor receptor superfamily member 5
v1.22577 | K05143 | Tumor necrosis factor receptor superfamily member 6B
v1.20003 | K05144 | Tumor necrosis factor receptor superfamily member 7
v1.23750; v1.17970; v1.19022 | K05146 | Tumor necrosis factor receptor superfamily member 9
v1.07527 | K05148 | Tumor necrosis factor receptor superfamily member 11B
v1.10221 | K05151 | Tumor necrosis factor receptor superfamily member 13C
v1.14826; v1.01054 | K05152 | Tumor necrosis factor receptor superfamily member 14
v1.09514 | K05156 | Tumor necrosis factor receptor superfamily member 19
v1.01640 | K05161 | Tumor necrosis factor receptor superfamily member 26
v1.08207; v1.16237; v1.14824 | K10133 | Tumor protein p53-inducible protein 3



significantly to the protection of coral reef ecosystems 
against the detrimental effects of organic enrichment 
[351,352]. One emerging threat to coral reefs is the out- 
break of infectious diseases (reviewed in [353]). Although 
highly subjective and with little experimental evidence to 
date, the coral probiotic hypothesis [354] suggests that the 
coral prokaryotic microbiome can adapt to changing en- 
vironmental conditions by selective microbial reorganisa- 
tion to impart greater resistance to disease and pathogen- 
mediated bleaching [355]. Whether the coral microbiome 
can respond to changing environmental conditions more 
rapidly than by host genetic mutation and selection based 
on contemporary phenotypic evolution on ecological 
time-scales [356], is a topic of current debate [357]. 

Corals, like other invertebrates, have an innate im- 
mune system based on self-histocompatibility recogni- 
tion (reviewed in [358]), but to date few adaptive 
components have been identified [359]. Corals do not 
produce antibodies and thus lack a true adaptive immune 
system. Nonetheless, corals once susceptible to infection 
and bleaching caused by a specific bacterial agent can become
immune to the invading pathogen by a phenomenon
termed "experience-mediated tolerance", a precept of the
hologenome theory of evolution [360], although how this 
process occurs is largely unknown. In our annotation of 
the genome sequence of A. digitifera we uncovered genes 
encoding the expression of disease resistance proteins 
(Table 13), two of which match the plant RPM1 and RPS2 
pathogen resistance proteins that guard against disease by 
binding with pathogen avirulence receptors [360,361]. Sig- 
nificant also is a gene to express the pathogenesis-related 
protein PR-1 (29 sequence domain matches) that is indu- 
cible in plants for systemic acquired resistance to patho- 
genic invasion [362]. We uncovered also multiple genes 
encoding the expression of myeloperoxidase (MPO) 
enzymes. MPOs produce hypochlorous acid from hydrogen
peroxide and chloride ion (requiring heme as a cofactor),
and oxidize tyrosine to the tyrosyl radical using
hydrogen peroxide as an oxidizing agent. Hypochlorous
acid and tyrosyl radicals are strong cytotoxic agents that 
in higher organisms are used as a primary defence by 
neutrophils to protect against invading pathogens. 
Phenoloxidase (tyrosinase) activity is reported to contribute 
to the innate defence system of A. millepora and Porites sp.

Table 13 Microbial symbiosis and pathogenicity proteins in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.06126 | K13061 | Acyl homoserine lactone synthase
v1.19990 | K01372 | Bleomycin hydrolase
v1.00209; v1.06178 | K03587 | Cell division protein FtsI (penicillin-binding protein 3)
v1.18860 | K13458 | Disease resistance protein
v1.16231; v1.00374; v1.08191 | K13457 | Disease resistance protein RPM1
v1.13482 [+ 4 other sequence copies] | K13459 | Disease resistance protein RPS2
v1.07889 | K12090 | Cag pathogenicity island protein 5
v1.24345 | K12091 | Cag pathogenicity island protein 6
v1.18924; v1.17622 | K12093 | Cag pathogenicity island protein 8
v1.05278 | K12096 | Cag pathogenicity island protein 11
v1.02083 | K12104 | Cag pathogenicity island protein 19
v1.12907 | K12109 | Cag pathogenicity island protein 24
v1.00209; v1.06178 | K03587 | Cell division protein FtsI (penicillin-binding protein 3)
v1.13874 | K07259 | Carboxy/endopeptidase (penicillin-binding protein 4)
v1.12514; v1.09758 | K04127 | Isopenicillin-N epimerase
v1.21332 | K04126 | Isopenicillin-N synthase
v1.07742 | K02547 | Methicillin resistance protein
v1.17478; v1.16977; v1.10289; v1.19907 | K04079 | Molecular chaperone HtpG (anti-bacterial)
v1.08255 | K13651 | Motility quorum-sensing regulator, GCU-specific toxin
v1.14792 [+ 7 other sequence copies] | K10789 | Myeloperoxidase
v1.02333 [+ 26 other sequence copies] | K13449 | Pathogenesis-related protein 1
v1.05017 | K03693 | Penicillin-binding protein
v1.17507 | K12556 | Penicillin-binding protein 2X
v1.13874 | K07259 | Penicillin-binding protein 4
v1.16655 | K02171 | Penicillinase repressor
v1.14688 | K15126 | Type III secretion system cytotoxic effector protein
v1.20647 | K03980 | Virulence factor, integral membrane protein
v1.18964 | K03810 | Virulence factor, oxidoreductase domain



[363] via activation of the melanin-signalling pathway that 
is induced in response to coral bleaching and localised dis- 
ease [364,365]. Three genes of A. digitifera encode tyro- 
sinase enzymes (data not tabulated) to account for the 
phenoloxidase activity reported in corals. 

The genome of A. digitifera also reveals homologues 
of genes that promote bacterial pathogenicity (Table 13), 
including virulence factors that are expressed and ex- 
creted by invading pathogens (bacteria, viruses, fungi 
and protozoa) to inhibit certain protective functions of 
the host. Such are the bacterial Type III cytotoxic ef- 
fector protein and multiple Type IV Cag pathogenicity 
island proteins encoded in the coral genome. Many 
Gram-negative bacteria utilize Type III secretion proteins, 
which are regulated by quorum sensing, to deliver cytotoxic 
effector proteins into eukaryote host cells during infection. 
Cag (cytotoxin-associated) pathogenicity island (PAI) proteins 
are encoded by mobile genetic elements of the Type IV system 
secreting both proteins and large nucleoprotein complexes [366] 
that may be transferred between prokaryotes to enhance selected 
traits of virulence [367]. Our annotation reveals genes encoding 
six pathogenicity island proteins (Table 13) with similarity 
to the Cag PAI proteins of the human pathogen Helicobacter 
pylori, an infectious bacterium causing peptic ulcers that 
may lead to the development of stomach cancer. While 
many properties of Type III and IV secretion system 
proteins have been well characterized in bacteria, the 
functional purpose of homologous genes in A. digitifera, 
if expressed, is unknown. 

The genome of A. digitifera contains genes of bacterial 
origin that encode the motility quorum-sensing regulator 
(a GCU-specific mRNA interferase toxin) and an acyl 
homoserine lactone synthase; acyl homoserine lactones are 
quorum-sensing signals that bacteria use to coordinate 
group behaviour according to collective population 
density. Apparent in our annotation (Table 13) is a wide 
array of microbial penicillin-binding proteins (PBPs) that 
have an affinity for β-lactam antibiotics, which by binding to 
PBPs prevent bacteria from constructing a cell wall. There 
are also genes to enhance antibiotic resistance, including 
potential expression of a penicillinase repressor, a 
methicillin resistance protein and bleomycin hydrolase (cysteine 
peptidase). Additionally, isopenicillin-N synthase and an 
isopenicillin-N epimerase, both of which catalyse key steps 
in the biosynthesis of penicillin and cephalosporin 
antibiotics, are encoded in the coral genome. Taken as a whole, 
we demonstrate an extensive presence of ancient 
non-metazoan genes that are maintained in the genome of A. 
digitifera, as is reported in the genomes of A. millepora and 
the anemone N. vectensis [368]. Recent thought on genome 
evolution places these ancestral conserved domains as 
'orphan' or 'taxonomically restricted' genes [352,369,370], 
rather than as genes acquired later by horizontal gene transfer. 
There is, of course, little knowledge of how or when, if at 
all, these non-metazoan genes are expressed or even their 
function to mediate pathogenicity in the coral holobiont. 

Proteins of viral pathogenicity 

Marine viruses were of minor interest until 1989, when 
it was realised that virus-like particles (VLPs) are the 
most abundant biological entities to occupy aquatic 
environments, with variable numbers reaching ~10⁸ VLPs 
mL⁻¹ [371]. Typically, VLPs surpass the number of marine 
bacteria by an order of magnitude in coastal waters 
[372]; their diversity is extremely high and many are spe- 
cific to the marine environment [373,374]. Significant 
VLP numbers are reported from the surrounding waters 
of oceanic coral reef atolls [375], in waters flowing 
across the reef substratum [376] and in samples taken 
within the close vicinity of coral colonies [377,378]. The 
viral load within the surface microlayer of scleractinian 
corals is enumerated as being 10⁷-10⁸ VLPs mL⁻¹ [379] 
and, based on VLP morphological diversity, is attributed 
to infecting various microbial hosts (bacteria, archaea, 
cyanobacteria, fungi and algae) residing within the coral 
mucus [380]. VLPs have been observed in the epidermal 
and gastrodermal tissues of corals and occasionally occur 
in the mesogloea [381]. Latent viruses were found to infect 
Symbiodinium isolated from several scleractinian corals 
[382-384] with a preponderance of eukaryotic algae- 
infecting phycodnaviruses suggested [385]. A wide range 
of bacteriophage and eukaryotic virus families have been 
identified within scleractinians using metagenomic ana- 
lyses [207,386-388], with bacteriophages being by far the 
most abundant entities (Wood-Charlson EM, Weynberg 
KD, Suttle CA, Roux S, van Oppen MJH: Methodological 
biases in coral viromics, submitted). 



The importance of the coral-virus interactome in bleaching 
and disease (reviewed in [185,389]) is founded on 
reports showing that VLP abundances are higher in the 
seawater immediately surrounding diseased corals compared to 
that of healthy corals [378], that latent viruses are induced 
by heat stress in symbiotic dinoflagellates of the sea 
anemone Anemonia viridis [382] and the coral Pavona danai 
[383], and that UV exposure induces a latent virus-like 
infection in cultured Symbiodinium [187]. Quantitative 
454 pyrosequence analysis of the coral Porites compressa 
on exposure to reduced pH, elevated nutrients or thermal 
stress showed that the abundance of its viral consortia 
varied across treatments, but notably a novel herpes-like virus 
increased by up to 6 orders of magnitude on exposure to 
abiotic stress [387], although some caution may be 
warranted in assessing the reliability of such 
determinations [Wood-Charlson et al., submitted]. Unexpectedly, 
the proteome of an endosymbiont-enriched fraction of the 
coral Stylophora pistillata showed a significant 114-fold 
increase in a viral replication protein on thermal bleaching 
[39], which is consistent with the finding of VLP induction 
in P. compressa by similar treatment [387]. 
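
The changes quoted above are reported on two different scales (orders of magnitude and fold change). As a small worked example, not part of the original analysis, the following Python sketch puts them on a common base-10 logarithmic scale; the function name is chosen here purely for illustration.

```python
import math

def orders_of_magnitude(fold_change: float) -> float:
    """Return the base-10 logarithm of a (positive) fold change."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return math.log10(fold_change)

# 114-fold increase in the viral replication protein [39]
print(round(orders_of_magnitude(114), 2))  # 2.06

# An increase of 6 orders of magnitude [387] is a 10**6-fold change
print(round(orders_of_magnitude(10**6), 2))  # 6.0
```

On this scale the 114-fold proteomic increase corresponds to roughly two orders of magnitude, compared with up to six for the herpes-like virus sequences.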

General aspects of histocompatibility [390-393] and the 
genetic structure of innate immune receptors of the 
Cnidaria [363,394-401], including the immune response 
effected by coral disease and bleaching [364,402], have 
been examined extensively, hence further elaboration here 
is unnecessary. Instead, we focus on proteins that directly 
regulate the pathogenicity of coral-associated microbes 
and viruses. The A. digitifera genome encodes protein 
homologues having either putative antiviral or 
virus-promoting activities (Table 14). These homologues 
include the antiviral "superkiller" helicase SKI2 protein 
that acts by blocking viral mRNA translation [403] and, 
together with the superkiller proteins SKI3 (69 sequence 
alignments) and SKI8 of the exosome complex, 
functions in a 3'-mRNA degradation pathway [404]. The 
coral genome also encodes three exoribonuclease (RNase) 
enzymes (XRN1, XRN2 and RNB) with antiviral 
RNA-degrading properties [405,406]. Annotation of the coral 
genome reveals homologues to four interferon proteins 
(IFNB, IFNG, IFNW1 and IFNT1). Interferons are potent 
and selective antiviral cytokines [407], which are induced 
by viral infection or by sensing dsRNA, a by-product 
of viral replication, leading to the transcription of 
interferon-stimulated genes whose products have antiviral 
activities and others having antimicrobial, 
antiproliferative/antitumor or immunomodulatory effects [408,409]. 
Included in the coral antivirus defence system are three 
members of the interferon regulatory transcription factor 
(IRF1, IRF2 and IRF8) family proteins. IRF1 and IRF2 are 
transcriptional activators of cytokines and other target 
genes [410]; IRF1 is known to trans-activate the tumor 
suppressor protein p53 [411] while IRF2 regulates 
post-transcriptional induction of NO synthase [412]. 
Conversely, IRF8 is an interferon consensus sequence-binding 
protein that is a negative (interference) regulator 
of enhancer elements common to interferon-inducible 
genes [413]. The coral genome additionally includes an 
interferon-stimulated 20 kDa protein (ISG20) RNase 
specific to deactivation of single-stranded RNA viruses 
[414]. The coral genome encodes several interferon-inducible 
proteins, notably interferon gamma induced 
GTPase (IGTP) that accumulates in response to IFNB 
[415], the interferon-induced GTP-binding protein Mx1 
that is a key element of host antiviral defence [416], the 
interferon-induced helicase C domain-containing protein 1 
(aka MDA-5), which is an immune receptor that 
senses viral dsRNA to activate the interferon 
antiviral-response cascade [417], and the interferon-induced 
transmembrane protein (IFITM1) that suppresses cell growth 
[418]. The coral genome encodes the interferon-gamma 
receptor 2 (IFNGR2) transmembrane protein that activates 
downstream signal transduction cascades that control 
cell proliferation and apoptosis [419]. Encoded also 
is a homologue of the human bone marrow stromal cell 
antigen 2 (BST2) that inhibits retrovirus infection by 
preventing VLP release from infected cells [420]. 
Additionally encoded is a mitochondrial antiviral-signalling 
protein (MAVS) that triggers the host immune response 
by activation of the nuclear transcription factor NF-κB 
and the interferon regulatory transcription factor IRF3, 
which coordinates the expression of type-1 interferons 
such as IFNB [421]. 

Table 14 Regulatory and related proteins of viral pathogenicity in the predicted proteome of A. digitifera 

Gene sequence                             KEGG Orthology  Encoded protein description
v1.20647; v1.06188; v1.21287              K12599          Antiviral helicase SKI2
v1.18443 [+ 40 other sequence copies]     K12807          Baculoviral IAP repeat-containing protein 1 (BIRC1)
v1.06263 [+ 6 other sequence copies]      K04725          Baculoviral IAP repeat-containing protein 2/3/4 (BIRC2/3/4)
v1.14355                                  K08731          Baculoviral IAP repeat-containing protein 5 (BIRC5)
v1.04171 [+ 7 other sequence copies]      K10586          Baculoviral IAP repeat-containing protein 6 (BIRC6)
v1.12348; v1.01945; v1.16612              K06731          Bone marrow stromal cell antigen 2 (antiviral BST2)
v1.01539 [+ 7 other sequence copies]      K04012          Complement component receptor 2 (CR2)
v1.17305                                  K04462          Ecotropic virus integration site 1 protein (EVI1)
v1. 1496 [+ 4 other sequence copies]      K12618          5'-3' Exoribonuclease 1 (antiviral XRN1)
v1.22746; v1.19002; v1.12850; v1.21216    K12619          5'-3' Exoribonuclease 2 (antiviral XRN2)
v1.09005                                  K01147          Exoribonuclease II (antiviral RNB)
v1.22793; v1.12978; v1.19008; v1.20838    K09239          HIV virus type I enhancer-binding protein (HIVEP)
v1.02776 [+ 7 other sequence copies]      K15046          Influenza virus NS1A-binding protein (NS1A-BP)
v1.09829; v1.13077                        K05415          Interferon beta (IFNB)
v1.11946; v1.21512; v1.11221; v1.11927    K04687          Interferon gamma (IFNG)
v1.21512                                  K14140          Interferon gamma induced GTPase (IGTP)
v1.11946                                  K05133          Interferon gamma receptor 2 (IFNGR2)
v1.01539 [+ 4 other sequence copies]      K04012          Interferon-induced GTP-binding protein Mx1
v1.10782; v1.23797; v1.17119; v1.03221    K12647          Interferon-induced helicase C domain-containing protein 1
v1.06274; v1.15849; v1.05943              K06566          Interferon induced transmembrane protein (IFITM1)
v1.21327; v1.24081                        K05440          Interferon, omega 1 (IFNW1)
v1.11817                                  K09444          Interferon regulatory factor 1 (IRF1)
v1.11816; v1.07639                        K10153          Interferon regulatory factor 2 (IRF2)
v1.11421                                  K10155          Interferon regulatory factor 8 (IRF8)
v1.02158                                  K12579          Interferon-stimulated gene 20 kDa protein (ISG20)
v1.15947                                  K05442          Interferon tau-1 (IFNT1)
v1.22825; v1.08034; v1.08520              K05788          Integration host factor subunit beta (IHFB)
v1.14899                                  K08220          MFS transporter, FLVCR family virus subgroup C receptor
v1.04514; v1.04513; v1.16929              K12648          Mitochondrial antiviral-signalling protein (MAVS)
v1.17718; v1.08002; v1.08001; v1.22382    K06081          Poliovirus receptor-related protein 1 (PVRL1)
v1.21413; v1.06637                        K06531          Poliovirus receptor-related protein 2 (PVRL2)
v1.11740; v1.21467; v1.11410; v1.17135    K06592          Poliovirus receptor-related protein 3 (PVRL3)
v1.15077                                  K06593          Poliovirus receptor-related protein 4 (PVRL4)
v1.04158 [+ 68 other sequence copies]     K12600          Superkiller protein 3 (antiviral SKI3)
v1.18238 [+ 4 other sequence copies]      K12601          Superkiller protein 8 (antiviral SKI8)
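
The "Gene sequence" column of Table 14 appears to abbreviate long lists of gene models as a first identifier followed by "[+ N other sequence copies]". Assuming that reading of the notation (it is inferred from the tables, not stated by the authors), the short Python sketch below recovers the total number of gene models behind a table cell; the function name `copy_count` is hypothetical.

```python
import re

def copy_count(cell: str) -> int:
    """Count the gene models named or implied by one 'Gene sequence' cell."""
    # Optional suffix such as "[+ 68 other sequence copies]"
    extra = re.search(r"\[\+\s*(\d+)\s+other sequence cop(?:y|ies)\]", cell)
    listed = cell[:extra.start()] if extra else cell
    # Explicitly listed identifiers are separated by ';' (occasionally ',')
    ids = [part for part in re.split(r"[;,]", listed) if part.strip()]
    return len(ids) + (int(extra.group(1)) if extra else 0)

print(copy_count("v1.04158 [+ 68 other sequence copies]"))  # 69
print(copy_count("v1.20647; v1.06188; v1.21287"))           # 3
print(copy_count("v1.14355"))                               # 1
```

Under this assumption the SKI3 entry expands to 69 gene models, which agrees with the "69 sequence alignments" quoted for SKI3 in the text.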

The coral genome encodes a full set of baculoviral IAP 
repeat-containing proteins BIRC1-6 (Table 14). The IAP 
(inhibitor of apoptosis) family proteins were first 
identified as proteins secreted by baculovirus to protect 
infected cells from death in the progression of viral 
replication [422]. Expressed by most eukaryotic organisms 
(reviewed in [423]), their IAP function is presumably conserved in 
corals. The coral genome encodes a full set of poliovirus 
receptor-related proteins (PVRL1-4) of the immunoglobulin 
superfamily, which bind and transport herpesviruses 
at the cellular membrane in the establishment of 
latent infections (reviewed in [424]). Encoded also is a 
complement component (3d/Epstein Barr virus) receptor 
2 (CR2) protein that binds to the Epstein-Barr virus 
(Herpesviridae) with antigenic activity for disease prevention 
[425]. Another encoded protein is a homologue of the 
human immunodeficiency virus type 1 (HIV-1) 
enhancer-binding protein (HIVEP; aka EBP1) that attaches to the 
HIV long terminal repeat (LTR) region to activate 
transcription via the HIV LTR [426]. Present in the coral 
genome is also a homologue of the influenza virus 
non-structural binding protein NS1A-BP that interacts with 
the NS1 virulence factor of the influenza A virus 
(Orthomyxoviridae) to interfere with NS1-inhibition of pre-mRNA 
splicing within the host nucleosome [427]. NS1A-BP 
inhibits NS1A-mediated disruption of the host immune 
response caused by restricting interferon production and the 
antiviral effects of IFN-induced proteins [428]. The 
genome of A. digitifera encodes an integration host factor 
subunit beta (IHFB), first discovered as a host factor for 
bacteriophage λ integration of mobile genetic elements, 
that in E. coli is involved in multiple processes of DNA 
replication, site-specific recombination and gene 
expression [429]. A homologue of the MFS transporter feline 
leukemia virus subgroup C receptor (FLVCR) cell 
surface protein is encoded in the coral genome, which in 
cats confers susceptibility to FeLV-C infection [430]. 
Encoded also is a viral integration site 1 (EVI1) that 
in humans is an oncogenic transcription factor, often 
activated by viral infection, to cause proliferation of 
invasive tumours [431]. Arguably, these homologue proteins 
typically expressed in such distantly related species may 
have similar relevance in viral interactions of the coral 
holobiome. 

How these regulatory proteins and viral receptors 
interact and respond to viral infection in corals is yet to 
be realised. The absence of virion-specific sequences 
(e.g. for nucleic acid replication or capsid structure) sug- 
gests that proviral DNA is absent from the coral gen- 
ome, or it may be an artefact of the limited number of 
marine viral sequences deposited in public databases. 
Discovery of viral activity through proteomics [39] may, 
therefore, suggest that viral proteins are synthesised 
from a lytic infection, but this requires confirmation. 

Toxins and venom 

A review of protein sequences deposited in the UniProt 
database in October 2012 shows that there are 150 
known cnidarian toxins. These toxins have diverse 
biological activities (neurotoxins, pore-forming cytolysins 
and venom phospholipases) used to capture prey and for 
protection against predators [432], and are best 
characterised in sea anemones (Actiniaria) with 141 sequences 
deposited [433,434]. The cytotoxin MCTx-1 isolated 
from the Net Fire Coral Millepora dichotoma is the only 
toxin from a coral deposited in UniProt (accession 
number A8QZJ5). However, our initial examination of the 
predicted proteome of A. digitifera shows 18 proteins with 
similarity to bacterial toxins and associated regulatory 
proteins (Table 15). Unlike reports from proteomic 
examination of the coral S. pistillata [39], nematocysts 
(stinging organelles) of the jellyfish Olindias 
sambaquiensis [435], Tamoya haplonema, Chiropsalmus 
quadrumanus and Chrysaora lactea (PF Long et al., pers comm), 
sea anemones [434] and the highly dangerous box 
jellyfish Chironex fleckeri [436,437], no venoms typical of 
higher animals were found in the A. digitifera genome. 
This was because our annotation was carried out using 
the KEGG database (release v58 [53]) to relate A. digitifera 
protein sequences to KEGG orthologues. The KEGG 
database is a collection of proteins from well characterised and 
ubiquitous biochemical pathways. Animal venoms, 
however, are highly specialised proteins for which this release 
of the KEGG database does not contain any described 
orthologues. 
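
The limitation described above follows from how orthology-based annotation works: a protein can only receive a label that exists in the reference collection. The Python sketch below is a minimal illustration of that point, not the authors' pipeline. The protein names, scores and score threshold are invented, the KEGG Orthology identifiers are used only as placeholder labels, and the hit list stands in for the output of a profile search.

```python
from typing import Dict, Iterable, Optional, Tuple

# (protein id, KEGG Orthology id, bit score); all values invented for illustration
Hit = Tuple[str, str, float]

def assign_orthologues(proteins: Iterable[str], hits: Iterable[Hit],
                       min_score: float = 50.0) -> Dict[str, Optional[str]]:
    """Give each protein the KO of its best-scoring hit, or None if no hit passes."""
    best: Dict[str, Tuple[str, float]] = {}
    for protein, ko, score in hits:
        # Keep only hits above the (assumed) threshold, and only the best per protein
        if score >= min_score and (protein not in best or score > best[protein][1]):
            best[protein] = (ko, score)
    return {p: (best[p][0] if p in best else None) for p in proteins}

hits = [
    ("protein_A", "K10789", 212.0),
    ("protein_A", "K13449", 64.5),
    ("protein_B", "K11028", 31.0),  # below threshold, ignored
]
annotation = assign_orthologues(["protein_A", "protein_B", "protein_C"], hits)
print(annotation)
# {'protein_A': 'K10789', 'protein_B': None, 'protein_C': None}
```

Here `protein_C` plays the role of a venom protein: with no corresponding orthologue in the reference set it produces no hit and stays unannotated, however abundant it may be in the proteome.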

KEGG orthology-based annotation of the A. digitifera 
genome reveals genes encoding protein homologues of 
10 bacterial toxins, 7 regulatory toxin proteins and a 
botulinum protein substrate (Table 15). Of these toxin 
homologues, one has similarity to the anthrax edema factor 
(EF) adenylate cyclase (CyaA), one of three proteins 
that comprise the anthrax toxin of Bacillus anthracis, 
the other two being a protective antigen (PA) and lethal 
factor (LF). Without the LF protein, anthrax CyaA has 
no known toxic effects in animals [438], although the 
EF protein does play an important role in disabling 
cellular functions vital for host defence against microbes 
[439]. The A. digitifera genome encodes a protein like the 
secreted virulence factor exotoxin A of Pseudomonas 
aeruginosa, which in this bacterium effects local 
tissue damage, bacterial invasion and immunosuppression 
within its eukaryote host [440] with pathogenicity similar 
to that of the diphtheria toxin [441]. Another encoded 
protein resembles the murine toxin (Ymt) produced by the 
enterobacterium Yersinia pestis, the causative agent 
of bubonic plague [442]. Additionally, two non-hemolytic 
enterotoxins similar to NheA and NheBC produced by Bacillus 
cereus [443], an enterotoxin (EntA) similar to that of 
Staphylococcus aureus [444], a Shiga-like enterotoxin (StxB) 
produced by Shigella dysenteriae, the diarrhoea-causing toxin A/B 
(TcdAB) such as that secreted by Clostridium difficile 
[445], and a protein similar to the zonula occludens (tight 
junction) enterotoxin (Zot) secreted by Vibrio cholerae 
[446] are encoded in the A. digitifera genome. Within the 
predicted proteome is also a homologue of the vacuolating 
cytotoxin (VacA) produced by Helicobacter pylori that 
colonises the gastric mucosa of the human stomach 
epithelium [447]. 

Although a direct homologue of the cholera toxin (CT) 
was not found encoded in the A. digitifera genome 
(Table 15), a protein similar to its transcriptional activator 
ToxR was. ToxR not only controls the expression of CT in 
Vibrio cholerae [448], but also that of a co-regulated pilin 
(TcpA) protein that is under control of the ToxR regulon cascade 
[449]. Bacterial TcpA protein is assembled into 
toxin-coregulated pili that induce the transfer of DNA by 
horizontal exchange of genetic material during conjugation 
[450]. TcpA and two toxin co-regulated biosynthetic 
proteins (TcpI and TcpS) of the bacterial virulence-associated 
pilus appendage [451] are encoded in the coral genome. 
Encoded also are the motility quorum-sensing regulator 
MqsR, an interferase toxin, and its transcriptional regulator 
MqsA that in Escherichia coli controls biofilm formation by 
inhibiting quorum-sensing motility, and together the MqsR/MqsA 
complex represses the lethal cold shock-like protein cspD 
gene [452] that on expression impairs DNA replication 
[453]. The A. digitifera genome likewise encodes a Type III 
secretion system (T3SS) cytotoxic effector (BteA) protein 
[454] that in Gram-negative invasive bacteria is translocated 
into host cells to suppress innate immunity and enhance 
virulence [455,456]. However, the ecophysiological 
significance of these toxigenic proteins and allied regulators, if 
indeed expressed by the coral genome, is unknown. 

Table 15 Proteins homologous to bacterial toxins in the predicted proteome of A. digitifera 

Gene sequence                             KEGG Orthology  Encoded protein description
v1.20214                                  K11029          Anthrax edema toxin adenylate cyclase (CyaA)
v1.17686                                  K10921          Cholera toxin transcriptional activator (ToxR)
v1.13017                                  K11020          Exotoxin A (ToxA)
v1.23507                                  K13655          HTH-type transcriptional regulator (MqsA) antitoxin for MqsR
v1.21184                                  K11009          Murine toxin (Ymt)
v1.04313                                  K11033          Non-hemolytic enterotoxin A (NheA)
v1.08011                                  K11034          Non-hemolytic enterotoxin B/C (NheBC)
v1.08255                                  K13651          Motility quorum-sensing regulator (MqsR) interferase toxin
v1.15986                                  K11059          Probable enterotoxin A (EntA)
v1.13046                                  K04392          Ras-related C3 botulinum toxin substrate 1 (Rac1)
v1.13966                                  K11007          Shiga toxin subunit B (StxB)
v1.23958                                  K11063          Toxin A/B (TcdAB)
v1.21174                                  K10930          Toxin co-regulated pilin (TCP)
v1.05802                                  K10961          Toxin co-regulated pilus biosynthesis protein I (TcpI)
v1.21783                                  K10964          Toxin co-regulated pilus biosynthesis protein S (TcpS)
v1.14688                                  K15126          Type III secretion system cytotoxic effector protein (BteA)
v1.05520                                  K11028          Vacuolating cytotoxin (VacA)
v1.06590                                  K10954          Zona occludens toxin (Zot)

In addition to using the KEGG database, we undertook a 
BLAST search of the predicted proteome of A. digitifera 
against peptide sequences for all animal venoms using the 
annotated UniProtKB/Swiss-Prot Tox-Prot program [457]. 
This search revealed a large number of accession hits from 
the predicted proteome, although these are unlikely to be 
true multiple copies given that the genome sequence has 
yet to be completely assembled. However, just taking a 
single accession number from each annotation reveals a 
complex array of 83 toxins that represents the predicted 
venom of A. digitifera (Table 16); UniProt BLAST E-values 
are given in Additional file 1: Table S16b. These venoms 
are highly diverse and are significantly homologous to 
toxins from a wide variety of venomous marine and 
terrestrial creatures such as fish, reptiles, other cnidarians, 
cone snails, stinging insects and even a venomous mammal 
(shrew), covering the complete range of pharmacological 
properties known in venoms, including cytolytic, 
neurotoxic, haemotoxic, phospholipase, proteinase and 
proteinase inhibitor activities. Both the number of toxins 
predicted in the venom of A. digitifera and the degree 
of homology to such widely divergent phyla are 
remarkable. Accordingly, cnidarian venoms may possess 
unique biological properties that might generate new 

Table 16 UniProt-predicted homologues of animal venom proteins in the predicted proteome of A. digitifera 



Gene sequence   UniProt toxin accession   Animal with closest homology 

v1.01916 [+ 5 other sequence copies] 
v1.06761; v1.08075; v1.09840; v1.20323 
v1.04809 
v1.06380 
v1.10291 
v1.14412 

Q92035; Acetylcholinesterase   Bungarus fasciatus (Banded Krait) 
Q9IAM1; Agkisacutacin (subunit anticoagulant protease)   Deinagkistrodon acutus (Sharp-nosed Viper) 



v1.23477 
v1.16440 
v1.16571 [+ 10 other sequence copies] 
v1.06055 [+ 20 other sequence copies] 
v1.07831; v1.10094; v1.20732 
v1.01708 [+ 5 other sequence copies] 
v1.09601; v1.10410 
v1.06821 
v1.08924 
v1.06189 [+ 112 other sequence copies] 
v1.02942 [+ 8 other sequence copies] 
v1.00644 [+ 32 other sequence copies] 
v1.07446 
v1.20653 
v1.02561; v1.11493; v1.16681 
v1.13597; v1.08696; v1.10757; v1.20654 
v1.18386; v1.15479 
v1.06094 
v1.06416; v1.16248; v1.23712 
v1.17681 
v1.00077 [+ 14 other sequence copies] 
v1.12241; v1.02332; v1.12298 
v1.02245 [+ 19 other sequence copies] 
v1.03638; v1.14772 
v1.13106 
v1.11132 
v1.02168 
v1.06910 
v1.22282 


A8QL52; L-Amino acid oxidase 
Q4JHE1; L-Amino acid oxidase 
P81383; L-Amino acid oxidase 
A6MFL0; L-Amino acid oxidase 
P81383; L-Amino acid oxidase 
P81382; L-Amino acid oxidase 
C5NSL2; Bandaporin (haemolysin) 
Q76B45; Blarina toxin (vasoactive protease) 
Q593B6; Coagulation factor V 
P14530; Coagulation factor IX 
Q4QXT9; Coagulation factor X 
Q93109; Equinatoxin-5 (cytolysin) 
Q08169; Hyaluronidase 
I0CME7; Hyaluronidase, Conohyal-Cn1 
Q9XZC0; α-Latrocrustotoxin Lt1a (neurotoxin) 
G0LXV8; α-Latrocrustotoxin Lh1a (neurotoxin) 
Q25338; δ-Latroinsectotoxin Lt1a (neurotoxin) 
A7X3X3; Lectin, Lectoxin Enh4 (platelet binding) 
A7X3Y6; Lectin, Lectoxin Enh7 (platelet binding) 
A7X3Z4; Lectin, Lectoxin Lio1 (platelet binding) 
A7X3Z7; Lectin, Lectoxin Lio2 (platelet binding) 
A7X413; Lectin, Lectoxin Lio3 (platelet binding) 
A7X406; Lectin, Lectoxin Phi1 (platelet binding) 
A7X3Z0; Lectin, Lectoxin Thr1 (platelet binding) 
Q6TPG9; Lectin, Mucrocetin (platelet binding) 
Q66S03; Lectin, Nattectin (platelet binding) 
Q71RQ1; Lectin, Stejaggregin-A (platelet binding) 
A0FKN6; Metalloprotease, Astacin-like toxin 
Q90391; Metalloprotease, Atrolysin 
D3TTC2; Metalloproteinase, Atragin 
Q7T1T4; Metalloproteinase, BjussuMP-2 
O73795; Metalloproteinase, Disintegrin 
Q7SZE0; Metalloproteinase, Disintegrin 
P14530; Metalloproteinase, Disintegrin 



Bungarus fasciatus (Banded Krait) 
Pseudechis australis (Mulga Snake) 
Ophiophagus hannah (King Cobra) 
Demansia vestigiata (Lesser Black Whipsnake) 
Ophiophagus hannah (King Cobra) 
Calloselasma rhodostoma (Malayan Pit Viper) 
Anthopleura asiatica (Sea Anemone) 
Blarina brevicauda (Northern Short-Tailed Shrew) 
Pseudonaja textilis (Eastern Brown Snake) 
Protobothrops flavoviridis (Okinawa Habu Snake) 
Tropidechis carinatus (Rough-Scaled Snake) 
Actinia equina (Beadlet Anemone) 
Apis mellifera (European Honey Bee) 
Conus consors (Singed Cone) 
Latrodectus tredecimguttatus (Mediterranean Black Widow Spider) 
Latrodectus hasseltii (Australian Redback Spider) 
Latrodectus tredecimguttatus (Mediterranean Black Widow Spider) 
Enhydris polylepis (Macleay's Water Snake) 
Enhydris polylepis (Macleay's Water Snake) 
Liophis poecilogyrus (Water Snake) 
Liophis poecilogyrus (Water Snake) 
Liophis poecilogyrus (Water Snake) 
Philodryas olfersii (Green Cobra) 
Thrasops jacksonii (Black Tree Snake) 
Protobothrops mucrosquamatus (Brown Spotted Pit Viper) 
Thalassophryne nattereri (Toad Fish) 
Trimeresurus stejnegeri (Bamboo Viper) 
Loxosceles intermedia (Recluse Spider) 
Crotalus atrox (Western Diamondback Rattlesnake) 
Naja atra (Chinese Cobra) 
Bothrops jararacussu (Jararacussu Pit Viper) 
Gloydius brevicaudus (Chinese Mamushi Snake) 
Gloydius saxatilis (Rock Mamushi Snake) 
Protobothrops flavoviridis (Okinawa Habu Snake) 



v1.03804   Q2UXQ5; Metalloproteinase, EoVMP2 

v1.02016   Q91511; Mucrofibrase-5, Hypotensive serine protease 



v1.04153; v1.04595; v1.12730; v1.04157 
v1.12433 [+ 5 other sequence copies] 
v1.00019; v1.13757 
v1.09322; v1.09961; v1.13629 
v1.03556 
v1.13015; v1.16921 
v1.18628 
v1.11796 
v1.09883 
v1.14874 
v1.11797 
v1.07278 [+ 34 other sequence copies] 
v1.11045 
v1.04104 [+ 5 other sequence copies] 
v1.00387 [+ 9 other sequence copies] 
v1.02137 [+ 38 other sequence copies] 
v1.00618 [+ 10 other sequence copies] 
v1.09896 
v1.13726 
v1.02129; v1.05362; v1.20273 
v1.06980; v1.09028 
v1.21284 [+ 5 other sequence copies] 
v1.18895 [+ 20 other sequence copies] 
v1.14251; v1.10489; v1.14254 
v1.06759 [+ 7 other sequence copies] 
v1.01273 
v1.09855; v1.09856 
v1.16247 
v1.08397; v1.09733 
v1.03275 
v1.16638 
v1.22320 
v1.15074 [+ 4 other sequence copies] 



v1.09026   Q7ZZN8; Natrin-2 (neurotoxin) 

A0ZSK3; Neoverrucotoxin (haemolysin) 
A2VBC4; Phospholipase A1 
Q06478; Phospholipase A1 1 
P0CH47; Phospholipase A1, Magnifin 
P53357; Phospholipase A1 2 
D2X8K2; Phospholipase A2 
Q9TWL9; Phospholipase A2, Conodipine-M 
Q9PUH9; Phospholipase A2, Acidic S9-53 F 
Q8AXW7; Phospholipase A2, Basic 
Q90WA8; Phospholipase A2, Basic 2 
P20256; Phospholipase A2, Basic PA-12C 
Q7SZN0; Prothrombin activator Pseutarin-C 
P83370; Prothrombin activator Hopsarin-D 
Q58L94; Prothrombin activator Notecarin D2 
Q58L90; Prothrombin activator Omicarin C 
Q58L91; Prothrombin activator Omicarin C 
Q58L93; Prothrombin activator Porpharin D 
P81428; Prothrombin activator Trocarin D 
A6MFK7; Prothrombin activator Vestarin D1 
Q6T269; Protease inhibitor, Bitisilin-3 (neurotoxic) 
Q3SB05; Pseudechetoxin (neurotoxin) 
D8VNS7; Ryncolin-1 (haemostasis inhibitor) 
D8VNS8; Ryncolin-2 (haemostasis inhibitor) 
D8VNS9; Ryncolin-3 (haemostasis inhibitor) 
D8VNT0; Ryncolin-4 (haemostasis inhibitor) 
Q9YGN4; Salmorin toxin (haemostasis inhibitor) 
B2DCR8; SE-Cephalotoxin 
O13060; Serine protease, 2A 
Q9DF66; Serine protease, 3 (haemostasis inhibitor) 
Q9DG84; Serine protease, Serpentokallikrein-2 (haemostasis inhibitor) 
Q7SYF1; Serine protease, Cerastocytin (platelet binding) 
P0C5B4; Serine protease, Gloshedobin (platelet binding) 
B2D0J4; Serine protease, Venom dipeptidyl peptidase 4 



Echis ocellatus (West African Carpet Viper) 

Protobothrops mucrosquamatus (Brown Spotted Pit 
Viper) 

Naja atra (Chinese Cobra) 
Synanceia verrucosa (Reef Stone Fish) 

Polybia paulista (Neotropical Social Wasp) 

Dolichovespula maculata (Bald-Faced Hornet) 

Vespa magnifica (Giant Hornet) 

Dolichovespula maculata (Bald-Faced Hornet) 

Condylactis gigantea (Giant Caribbean Sea Anemone)

Conus magus (Magical Cone) 

Austrelaps superbus (Lowland Copperhead Snake) 

Micrurus corallinus (Painted Coral Snake) 

Bungarus fasciatus (Banded Krait) 

Pseudechis australis (Mulga Snake) 

Pseudonaja textilis (Eastern Brown Snake) 

Hoplocephalus stephensii (Stephens' Banded Snake)
Notechis scutatus (Tiger Snake) 

Oxyuranus microlepidotus (Inland Taipan)

Oxyuranus scutellatus (Coastal Taipan) 

Pseudechis porphyriacus (Red-Bellied Black Snake) 

Tropidechis carinatus (Rough-Scaled Snake) 
Demansia vestigiata (Lesser Black Whipsnake) 
Bitis gabonica (Gaboon Viper) 
Pseudonaja textilis (Eastern Brown Snake) 
Cerberus rynchops (Dog-Faced Water Snake) 

Cerberus rynchops (Dog-Faced Water Snake) 

Cerberus rynchops (Dog-Faced Water Snake) 
Cerberus rynchops (Dog-Faced Water Snake) 

Gloydius brevicaudus (Chinese Mamushi Snake) 

Sepia esculenta (Golden Cuttlefish) 

Trimeresurus gramineus (Bamboo Viper) 

Protobothrops jerdonii (Jerdon's Pit Viper) 

Protobothrops mucrosquamatus (Brown Spotted Pit 
Viper) 

Cerastes cerastes (Saharan Horned Viper) 
Gloydius shedaoensis (Shedao Pit Viper) 
Apis mellifera (European Honey Bee) 
Table 16 UniProt-predicted homologues of animal venom proteins in the predicted proteome of A. digitifera (Continued)

v1.05361 | B6RLX2; Serine protease inhibitor, TCI (neurotoxin) | Ophiophagus hannah (King Cobra)
v1.10994 | B7S4N9; Serine protease inhibitor, Taicatoxin (neurotoxin) | Oxyuranus scutellatus (Coastal Taipan)
v1.11218; v1.23374 | Q90WA0; Serine protease inhibitor, Textilinin-2 (thrombin inhibitor) | Pseudonaja textilis (Eastern Brown Snake)
v1.17856; v1.22256 | Q8T3S7; Serine protease inhibitor, U1-aranetoxin-Av1a (neurotoxin) | Araneus ventricosus (Devil Spider)
v1.04154 [+ 4 other sequence copies] | Q98989; Stonustoxin (haemostasis inhibitor) | Synanceja horrida (Estuarine Stonefish)
v1.09427; v1.16619; v1.19446 | Q76DT2; Toxin AvTX-60A (cytolysin) | Actineria villosa (Okinawan Sea Anemone)
v1.12311 | Q9GV72; Toxin CrTX-A (haemolysin) | Carybdea rastonii (Jimble Jellyfish)
v1.07546 [+ 5 other sequence copies] | P58911; Toxin PsTX-60 (haemolysin) | Phyllodiscus semoni (Night Anemone)
v1.11270; v1.14265 | E2IYB3; Veficolin-1 (complement activator) | Varanus komodoensis (Komodo Dragon)
v1.02115 | Q98993; Verrucotoxin (cytolysin) | Synanceja verrucosa (Reef Stonefish)



leads in the discovery of novel pharmacologically active
drugs. Gene duplication followed by mutation and natural
selection is widely held to be the key mechanism by
which the large diversity of toxins found within a
single venom could have evolved [458,459]. Conversely,
primary mRNA splicing patterns have been shown to
account for the diversity of metalloproteinases in the
pit viper Bothrops neuwiedi [460]. Variations in peptide
processing have also been shown by proteomics and
transcriptomics to explain how a limited set of gene
transcripts could generate thousands of toxins in a
single species of cone snail [461]. Despite these various
processes that could account for the evolution of toxin
diversity, it has never been demonstrated how gene
duplications or variations in transcript or peptide
processing could have radiated across the very different
poisonous creatures found on Earth. Our data
(Table 16) reveal that the predicted toxins of A.
digitifera venom are orthologues of all of the most
important superfamilies of peptide/protein venoms found
in diverse taxa. We posit that the toxins in
the venoms of higher organisms may have arisen from
deep eumetazoan innovations and that the molecular
evolution of these venom gene superfamilies can now
be addressed by taking an integrated venomics approach
using Cnidaria such as the jellyfish as model systems
[462].

Detoxification proteins of the chemical defensome 

Considerable advances have been made in understanding
the effects of pollution on coral reef
habitats. The three main categories of environmental
pollutants from anthropogenic sources are nutrient
enrichment (eutrophication), hydrocarbon pollution and
heavy metal contamination. Eutrophication from terrestrial
inputs is a significant threat to coral reefs, stemming from
the discharge of treated sewage, the runoff of agricultural
fertilizers (plus herbicides and pesticides), and
sedimentation caused by the erosion of organic-rich soils [463].
Notwithstanding that eutrophication can shift coral reef
communities towards macroalgal domination [19], nitrogen
and phosphorus enrichment can diminish coral growth
and affect the photosynthetic performance of their algal
symbionts [464]. Nutrient enhancement alters multiple
pathways of primary metabolism, which in corals is
complicated by the photosynthetic demands of their symbiotic
partners. While corals respond to hypertrophic levels of
nutrients by activating general stress-response proteins
[465], there are no specific proteins known to mitigate the
cellular effects of nutrient enrichment on corals per se, and
we have not attempted to identify such proteins in this study.

Gene families and their regulators that defend against
chemical stressors comprise the chemical defensome,
which encodes a network of detoxifying proteins that allows
an organism to sense, transform and eliminate potentially
toxic endogenous metabolites and xenobiotic contaminants
[466]. Expressed proteins of the chemical defensome
include the biotransformation cytochrome P450 (CYP) family
of enzymes, conjugating enzymes, efflux transporters, heavy
metal membrane pump exporters and their transcriptional
activators. Annotation of the genome of A. digitifera reveals
multiple genes encoding 20 hemoproteins belonging to the
Phase I cytochrome P450 superfamily of monooxygenase
enzymes that catalyse the oxidation of diverse organic
substances (Table 17). The substrates of CYP enzymes
include intermediates of lipid metabolism and sterol/steroid
biosynthesis, as well as exogenous xenobiotics that are
thereby detoxified. Of significance are the CYP1A-type (aryl
hydrocarbon hydroxylase) enzymes that have been studied
widely in the hepatic response of fishes to polycyclic
aromatic hydrocarbon (PAH) contamination (from crude or
fuel oil) and exposure to polychlorinated biphenyl and
Table 17 Proteins of the chemical defensome in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.06127; v1.06128 | K01015 | Alcohol sulfotransferase
v1.09267 | K00537 | Arsenate reductase
v1.24496; v1.24495; v1.03953 | K03893 | Arsenical pump membrane protein
v1.10691 | K07755 | Arsenite methyltransferase
v1.20443 | K11811 | Arsenical resistance protein ArsH
v1.14972 | K01551 | Arsenite-transporting ATPase
v1.17644; v1.00480; v1.08150; v1.22865 | K01014 | Aryl sulfotransferase
v1.21535; v1.11835; v1.02456 | K01534 | Cd2+/Zn2+-exporting ATPase
v1.03485; v1.21926; v1.05686 | K01533 | Cu2+-exporting ATPase
v1.22646 [+ 8 other sequence copies] | K07408 | Cytochrome P450, family 1, subfamily A, polypeptide 1
v1.01284 | K07421 | Cytochrome P450, family 2, subfamily T
v1.10544; v1.02314; v1.17490 | K07422 | Cytochrome P450, family 2, subfamily U
v1.23039 [+ 13 other sequence copies] | K07422 | Cytochrome P450, family 3, subfamily A
v1.07750 | K07425 | Cytochrome P450, family 4, subfamily A
v1.22798; v1.23000 | K07426 | Cytochrome P450, family 4, subfamily B
v1.02020 [+ 4 other sequence copies] | K07427 | Cytochrome P450, family 4, subfamily V
v1.19495 | K07428 | Cytochrome P450, family 4, subfamily X
v1.15382 | K15002 | Cytochrome P450, family 6
v1.16427 | K07430 | Cytochrome P450, family 7, subfamily B
v1.17631 | K00498 | Cytochrome P450, family 11, subfamily A
v1.08074 [+ 4 other sequence copies] | K15004 | Cytochrome P450, family 12
v1.02478 [+ 5 other sequence copies] | K00512 | Cytochrome P450, family 17, subfamily A
v1.06713 | K07435 | Cytochrome P450, family 20, subfamily A
v1.22414 [+ 5 other sequence copies] | K07436 | Cytochrome P450, family 24, subfamily A
v1.20153 | K12665 | Cytochrome P450, family 26, subfamily C
v1.08074 [+ 6 other sequence copies] | K00488 | Cytochrome P450, family 27, subfamily A
v1.06537 | K07439 | Cytochrome P450, family 39, subfamily A
v1.22302 [+ 5 other sequence copies] | K07440 | Cytochrome P450, family 46, subfamily A
v1.16335 | K09832 | Cytochrome P450, family 710, subfamily A
v1.18439; v1.02594; v1.02593 | K01016 | Estrone sulfotransferase
v1.07758 [+ 5 other sequence copies] | K00699 | Glucuronosyltransferase
v1.00764 | K13299 | Glutathione S-transferase kappa 1
v1.17188 | K00799 | Glutathione S-transferase
v1.04140 | K07239 | Heavy-metal exporter, HME family
v1.10181 | K00481 | p-Hydroxybenzoate 3-monooxygenase
v1.16748; v1.07471 | K08365 | MerR family transcriptional regulator, mercuric resistance
v1.04382; v1.24424 | K13638 | MerR family transcriptional regulator, Zn(II)-responsive
v1.12760 | K08363 | Mercuric ion transport protein
v1.04179; v1.01891; v1.00145 | K03284 | Metal ion transporter, MIT family
v1.21500 [+ 5 other sequence copies] | K01253 | Microsomal epoxide hydrolase
v1.08005 | K08970 | Nickel/cobalt exporter
v1.03484 | K08364 | Periplasmic mercuric ion binding protein
v1.05406 | K07245 | Putative copper resistance protein D
v1.14635 | K08726 | Soluble epoxide hydrolase
v1.01929; v1.19296 | K05794 | Tellurite resistance protein TerC
v1.10880; v1.15709; v1.12348 | K07803 | Zinc resistance-associated protein



dibenzodioxin toxicants (reviewed in [467]). CYP450
activity has been detected in the corals Favia fragum [468],
Siderastrea siderea [469], Montastraea faveolata [470] and
Pocillopora damicornis [471]. Furthermore, CYP-encoding
sequences have been extracted from the genome of N.
vectensis [472] and the transcriptome of A. millepora [29].
As well as providing chemical defence, mixed-function
CYPs perform multiple endogenous tasks that are often
taxon-specific. Hence, the orthology and substrate
specificity of coral CYP enzymes cannot be predicted solely from
homology to CYPs of known function assigned to higher
metazoans. Similar in function to the CYP enzymes, there
are genes encoding p-hydroxybenzoate 3-monooxygenase,
an oxidoreductase catalysing aryl oxidation, and the soluble
and microsomal forms of epoxide hydrolase, which convert
epoxides formed by the degradation of aromatic
compounds to trans-diols that, after conjugation, are readily
excreted. Conjugating enzymes that eliminate hydroxylated
substrates are the detoxifying UDP-glucuronosyltransferase
and sulfotransferase families of enzymes. Estrone
sulfotransferase is significant for the inactivation of exogenous
(contraceptive) estrogens [473] and similar
endocrine-disruptive contaminants released from treated wastewater
[474]; their occurrence in marine waters is known to
disrupt the reproduction and development of fish [475] and
corals [476]. Glutathione S-transferase (GST) enzymes
catalyse the addition of reduced glutathione to the reactive
sites of electrophilic toxins [477]. Surprisingly, only two
isoforms of GST were detected in the A. digitifera genome
(Table 17), whereas 18 distinct GST-encoding genes (6
classes + 1 fungal-type) were classified from genome sequences
of N. vectensis [472]. This unexpected reduction of
GST elaboration in the A. digitifera genome warrants further
examination.
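
The counts quoted in the text (20 CYP entries and two GST entries) can be checked against Table 17 with a short tally. The sketch below is illustrative only: the protein descriptions are transcribed from Table 17, whereas the grouping into classes and the function name `classify` are assumptions made for this example and are not part of the published annotation. It counts table rows (KEGG orthology entries), not individual gene copies.

```python
from collections import Counter

# Encoded-protein descriptions transcribed from Table 17 (one per table row).
CYP_FAMILIES = ["1A1", "2T", "2U", "3A", "4A", "4B", "4V", "4X", "6", "7B",
                "11A", "12", "17A", "20A", "24A", "26C", "27A", "39A", "46A", "710A"]
TABLE17 = [f"Cytochrome P450 {fam}" for fam in CYP_FAMILIES] + [
    "Alcohol sulfotransferase", "Aryl sulfotransferase", "Estrone sulfotransferase",
    "Glucuronosyltransferase",
    "Glutathione S-transferase kappa 1", "Glutathione S-transferase",
    "p-Hydroxybenzoate 3-monooxygenase",
    "Microsomal epoxide hydrolase", "Soluble epoxide hydrolase",
    "Arsenate reductase", "Arsenical pump membrane protein",
    "Arsenite methyltransferase", "Arsenical resistance protein ArsH",
    "Arsenite-transporting ATPase", "Cd2+/Zn2+-exporting ATPase",
    "Cu2+-exporting ATPase", "Heavy-metal exporter, HME family",
    "MerR family transcriptional regulator, mercuric resistance",
    "MerR family transcriptional regulator, Zn(II)-responsive",
    "Mercuric ion transport protein", "Metal ion transporter, MIT family",
    "Nickel/cobalt exporter", "Periplasmic mercuric ion binding protein",
    "Putative copper resistance protein D",
    "Tellurite resistance protein TerC", "Zinc resistance-associated protein",
]


def classify(description: str) -> str:
    """Assign a table row to a broad defensome class (illustrative grouping)."""
    if description.startswith("Cytochrome P450"):
        return "CYP"
    if "Glutathione S-transferase" in description:
        return "GST"
    if "sulfotransferase" in description or "Glucuronosyltransferase" in description:
        return "other conjugating enzyme"
    if "monooxygenase" in description or "epoxide hydrolase" in description:
        return "oxidation/hydrolysis"
    return "metal/metalloid handling"


counts = Counter(classify(d) for d in TABLE17)
for group, n in sorted(counts.items()):
    print(f"{group}: {n}")
```

Under this grouping the tally agrees with the text: 20 CYP rows and two GST rows out of 46 rows in total.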

Many toxicological studies on the effects of pollution on
cnidarian fitness have focused on the response to heavy
metal contamination, including copper, cadmium,
mercury and zinc [478,479]. In scleractinian corals the uptake
and toxic effects of copper [480-483], cadmium [482] and
mercury [484,485] have been studied at the metabolic level,
with specific studies examining the effects of heavy metal
toxicity on coral fertilisation [486-488], settlement [487],
metamorphosis [486] and coral bleaching [489]. Yet
the identification of molecular markers to monitor the
response of Cnidaria to sub-lethal levels of heavy metal
exposure has been elusive [490]. Our annotation uncovered
a wide range of genes encoding
metal-specific (arsenic, copper, mercury, nickel/cobalt and
tellurium) resistance, transport and membrane pump
exporting proteins that, together with non-specific heavy
metal ion export proteins (Table 17), might prove useful
for monitoring the environmental response of A. digitifera
to heavy metal contamination. Included in the heavy metal
defensome are the MerR family of transcriptional regulators
of Hg- and Zn-resistance proteins and a periplasmic
ion-binding protein attributed to the Hg detoxification system
of bacteria [491]. Enzymes specific for arsenic
detoxification are an arsenate oxidoreductase for conversion of
arsenate to arsenite [492] and arsenite methyltransferase for
conversion of arsenite to the less toxic dimethylarsenite
that is amenable to excretion [493]. Such processes may
enhance the resilience of corals exposed to natural [494]
and site-affected [495] levels of arsenic contamination. In
contrast, no (organo)cyanide detoxification
genes were apparent in the A. digitifera genome, but one
sequence (v1.01601; K10814) encodes a hydrogen cyanide
synthase of unknown metabolic purpose (data not
tabulated). Ancillary evidence suggests that the expression of
HCN synthase could be linked to quorum sensing [496]
for regulating microbial densities of the coral holobiont
community.

Epigenetic and DNA-remodelling proteins 

In all Kingdoms of life, DNA methylation and chromatin
remodelling are pivotal to the regulation of gene
transcription independent of underlying allelic variation. One such
process mediated by epigenetic changes in eukaryotic
biology is the all-important cellular differentiation during
morphogenetic development. Epigenetic modifications cause
the activation, regulation or silencing of certain genes
without changing the basic DNA code. Changes in epigenetic
regulation can persist during cell division and across
multiple generations [497]. In addition, cytosine methylation
may be associated with a higher mutation rate, because
deamination of the methylated base produces thymine,
resulting in C/T mutations, which on reproduction may be
transmitted by the germline to subsequent generations in
selective processes of evolution [498]. On the other hand,
environmentally induced destabilisation of the epigenome
can produce epigenetic gene variants (epialleles) that
activate transcription and mobilization of DNA transposable
elements, which may subsequently lead to stable heritable
traits of environmental adaptation, as occurs by genetic
imprinting in plants [499]. Transposition thus has the
potential to direct increased frequencies of permanent genetic
mutations for selective adaptation.
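
The link between cytosine methylation and C/T mutation described above can be illustrated with a toy example. The sketch below is a simplification for illustration only: the sequence, the methylated positions and the function names are hypothetical, and every methylated cytosine is assumed to deaminate, which would not be the case in a real genome.

```python
def deaminate_methylcytosine(sequence: str, methylated_positions: set) -> str:
    """Replace each methylated cytosine with thymine (5-meC deamination product)."""
    bases = list(sequence)
    for pos in methylated_positions:
        if bases[pos] != "C":
            raise ValueError(f"position {pos} is {bases[pos]!r}, not a cytosine")
        bases[pos] = "T"
    return "".join(bases)


def point_changes(before: str, after: str) -> list:
    """List (position, old base, new base) for every position that differs."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(before, after)) if a != b]


# Hypothetical sequence with methylated cytosines at the two CpG sites.
original = "ACGTCGA"
mutated = deaminate_methylcytosine(original, {1, 4})
print(mutated)                           # ATGTTGA
print(point_changes(original, mutated))  # [(1, 'C', 'T'), (4, 'C', 'T')]
```

Unmethylated cytosines are left untouched, so in this toy model only the methylated sites contribute C/T transitions.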
One way by which genes are regulated at the epigenome
is through the remodelling of the chromatin histone-DNA
complex (the nucleosome), which by post-translational
modification changes the template structure of
DNA-associated histone proteins. These modifications are effected
by histone-lysine (and histone-arginine)
N-methyltransferase enzymes (Table 18), and these proteins may be
further modified by acetylation, ADP-ribosylation,
ubiquitination and phosphorylation (annotation not tabulated).
The methylation pattern of histone lysine residues is highly
predictive of the gene expression states of transcriptional
activation and repression [500]. Necessary epigenomic
reprogramming of histone modification at different stages of
cell development is effected by the activation of histone
and lysine-specific demethylase enzymes (Table 18).
Determinants for recognition of the histone code are being
revealed by a growing body of experimental data providing
valuable information on the molecular tractability of



Table 18 Epigenetic and DNA-remodelling proteins in the predicted proteome of A. digitifera

Gene sequence | KEGG Orthology | Encoded protein description
v1.04426; v1.02042 | K02528 | 16S rRNA (adenine1518-N6/1519-N6)-dimethyltransferase
v1.22358; v1.00249 | K14191 | 18S rRNA (adenine1779-N6/1780-N6)-dimethyltransferase
v1.19400; v1.04238 | K00561 | 23S rRNA (adenine2085-N6)-dimethyltransferase
v1.05107; v1.05242 | K01488 | Adenosine deaminase
v1.04152; v1.09790 | K14857 | AdoMet-dependent rRNA methyltransferase SPB1
v1.00197 | K13530 | AraC family transcriptional regulator DNA methyltransferase
v1.12967; v1.19789; v1.07763 | K14589 | Cap-specific mRNA (nucleoside-2'-O-)-methyltransferase 1
v1.24281 | K01489 | Cytidine deaminase
v1.16211; v1.14952; v1.01094; v1.06983 | K00558 | DNA (cytosine-5-)-methyltransferase
v1.19683; v1.05688; v1.04223 | K11324 | DNA methyltransferase 1-associated protein 1
v1.14033; v1.19860; v1.19081; v1.04188 | K11420 | Euchromatic histone-lysine N-methyltransferase
v1.02068 | K01487 | Guanine deaminase
v1.02920 | K05931 | Histone-arginine methyltransferase CARM1
v1.17589 [+ 7 other sequence copies] | K11446 | Histone demethylase JARID1
v1.07640 | K06101 | Histone-lysine N-methyltransferase ASH1L
v1.13515; v1.18577; v1.20187; v1.19182 | K09186 | Histone-lysine N-methyltransferase MLL1
v1.08381 | K09187 | Histone-lysine N-methyltransferase MLL2
v1.24258; v1.19182 | K09188 | Histone-lysine N-methyltransferase MLL3
v1.07992; v1.10302; v1.13829 | K09189 | Histone-lysine N-methyltransferase MLL5
v1.06939; v1.15255; v1.15254 | K11424 | Histone-lysine N-methyltransferase NSD1/2
v1.05552 | K11422 | Histone-lysine N-methyltransferase SETD1
v1.07744 | K11423 | Histone-lysine N-methyltransferase SETD2
v1.03190 | K11431 | Histone-lysine N-methyltransferase SETD7
v1.21867 | K11428 | Histone-lysine N-methyltransferase SETD8
v1.18700 [+ 8 other sequence copies] | K11421 | Histone-lysine N-methyltransferase SETDB
v1.07557; v1.11409 | K11419 | Histone-lysine N-methyltransferase SUV39H
v1.24733; v1.13497 | K11429 | Histone-lysine N-methyltransferase SUV420H
v1.15405; v1.10291; v1.17601; v1.02845; v1.08629 | K11450 | Lysine-specific histone demethylase 1
v1.23155; v1.09394; v1.17624; v1.05370 | K14835 | Ribosomal RNA methyltransferase Nop2
v1.18460 [+ 6 other sequence copies] | K03500 | Ribosomal RNA small subunit methyltransferase B
v1.07407; v1.03110 | K08316 | Ribosomal RNA small subunit methyltransferase D
v1.12193 | K02427 | Ribosomal RNA large subunit methyltransferase E
v1.11499 | K11392 | Ribosomal RNA small subunit methyltransferase F
v1.16053; v1.12676 | K03437 | RNA methyltransferase, TrmH family
Table 18 Epigenetic and DNA-remodelling proteins in the predicted proteome of A. digitifera (Continued)

Gene sequence | KEGG Orthology | Encoded protein description
[gene identifiers illegible in source] | K13097 | Methylcytosine dioxygenase
[gene identifier illegible in source] | K07451 | 5-Methylcytosine-specific restriction enzyme A
[gene identifiers illegible in source] | K00565 | mRNA (guanine-N7-)-methyltransferase
v1.06363; v1.03360; v1.21218 | K05925 | mRNA (2'-O-methyladenosine-N6-)-methyltransferase
v1.09661 | K07442 | tRNA (adenine-N1-)-methyltransferase catalytic subunit
v1.08094; v1.04036; v1.18614 | K03256 | tRNA (adenine-N(1)-)-methyltransferase non-catalytic subunit
v1.11456; v1.00738; v1.04577 | K03439 | tRNA (guanine-N7-)-methyltransferase
v1.08042 | K14864 | tRNA methyltransferase
v1.20501 | K00557 | tRNA (uracil-5-)-methyltransferase
v1.15147 | K14964 | Set1/Ash2 histone methyltransferase subunit ASH2
v1.08925 | K00571 | Site-specific DNA-methyltransferase (adenine-specific)

binding sites involved in epigenetic signalling [501], which
will provide further insight into epigenetic function.

Direct epigenetic modification of DNA (or mRNA)
occurs by methylation of cytosine, and to a lesser extent
adenosine and guanine, by nucleobase-specific DNA
methyltransferases (Table 18) to give 5-methylcytosine
(5-meC), 3-methyladenosine (3-meA) and 3-methylguanine
(3-meG) nucleotides, respectively. The principal
modification product, 5-methylcytosine, behaves much like regular
cytosine by pairing with guanine, but in areas of high
cytosine methylation, genome transcription is strongly
repressed (reviewed in [502]), together with the repression
of other chromatin-dependent processes, including the
incorporation of transposable elements [503]. Alteration in
the methylation status of the entire genome, individual
chromosomes or specific gene sites is essential for
normal cellular function, but processes for reprogramming
methylated DNA at different stages of cell development,
unlike the reversal of histone modifications, are poorly
defined [504]. While there are abundant enzymes to repair
DNA damage caused by spurious N-alkylation, direct
nucleotide C-demethylation (via the hypothetical "DNA
demethylase" [505]) is thermodynamically infeasible.
Instead, removal of epigenetic C-methylated nucleobases
occurs by several base-repair pathways involving DNA
excision or mismatch repair enzymes. The genome of A.
digitifera encodes a specific DNA glycosylase
enzyme [506] for excision of 3-meA, but no such
enzymes are encoded for the excision of 5-meC and 3-meG,
although a 5-methylcytosine-specific
restriction enzyme is encoded. Another pathway for DNA
demethylation requires base-specific deamination by the
AID/Apobec family of deaminase enzymes that, for example,
convert 5-meC to thymine, which is subsequently replaced
by cytosine by C/T mismatch repair enzymes. These
methylated nucleobases are recognized for deamination by the
cytosine, adenosine and guanine deaminase enzymes [507]
that are encoded in the A. digitifera genome, and their
deaminated bases are subsequently removed by DNA
mismatch repair enzymes. Additionally, the genome of A.
digitifera encodes a methylcytosine dioxygenase enzyme
that converts 5-methylcytosine to 5-hydroxymethylcytosine
(5-hmC), which is recognized for removal by the base
excision repair pathway [508] or via its 5-hmC deaminated
intermediate [507]. Combined, these DNA demethylation
pathways are able to remodel epigenetic modifications at
different stages of cell development.

Most current knowledge on DNA and protein
methylation comes from studies of mammals and plants, while
our understanding of the extent and roles of DNA
methylation in invertebrates, marine invertebrates in
particular, is still limited [509]. Little is known about the
epigenetic potential of corals to acclimatize and adapt to
the thermal and synergistic stressors that cause widespread
coral "bleaching" [510]. Yet, given that acclimatization
occurs via the generation of epiallele variants that can in
some instances lead to stable heritable traits of
environmental adaptation, there is growing interest in the
prospect that epigenetic modifications in corals or their algal
symbionts [511] may drive adaptation to defend against
the damaging threat imposed by rising temperatures from
global climate change. It is anticipated that this field of
study will rapidly accelerate with the need to better
understand epigenetic processes that may contribute to the
persistence of coral reefs.

Conclusions 

We offer ZoophyteBase as an unprecedented foundation
to interrogate the molecular structure of the predicted A.
digitifera proteome. Key findings include proteins
with relevance to host-symbiont function, dysfunction and
recovery, including those that direct vacuolar trafficking
and proteins linking symbiont photosynthesis to coral
calcification. An extensive catalogue of mammalian-like
proteins essential to neural function and venoms related to
those of distant animal phyla suggests their origins lie deep
in early eumetazoan evolution. Homologues of prokaryotic genes
that have not been described previously in any eukaryote
genome, such as flagellar proteins and proteins essential for
nitrogen fixation and photosynthesis, point towards lateral
gene transfer, perhaps mediated by viruses, that may lead
to "shared" metabolic adaptations of symbiosis and
provide corals with a limited ability for gene-encoded adaptation
to a changing global environment. It is anticipated that
understanding how the genome of a coral host interacts with
that of its vast array of symbionts, and how it may regulate
its metabolic quotient, for example through biochemical or
epigenetic modification, will rapidly accelerate our ability
to predict the fate of coral reefs.

Availability and requirements 

ZoophyteBase was constructed using the Metagenome/ 
Genome Annotated Sequence Natural Language Search 
Engine (MEGGASENSE). This is a general system for the 
annotation of sequence collections and presentation of the 
results in a database that can be searched using biologic- 
ally intuitive search terms. In this implementation, the 
predicted proteome of A. digitifera (genome assembly v. 
1.0 [48]) was used as the source of protein sequences. The 
annotation was carried out using the KEGG database (re- 
lease v58 [51]) to relate A. digitifera protein sequences to 
KEGG orthologues. The homologous protein sequences 
were used to construct hidden Markov model (HMM) 
profiles using the HMMER3 package [49]. The predicted 
proteome sequences of A. digitifera were searched with 
the HMM profiles to link proteins to appropriate KEGG 
orthologues [50,512]. A web interface was developed with 
various tools. The search platform Lucene/Solr [52] was 
used to implement natural language searches. Protein 
sequences provided by the user can be used for BLAST 
[50] searches against the coral proteome. Selected se- 
quences of the coral proteome can be analysed with third 
party software (e.g. [53]) to interrogate conserved do- 
mains. ZoophyteBase is deployed using Apache-Tomcat 
(version 7.0.28 for Linux x64 [513]) on the Ubuntu Linux 
server of the Section of Bioinformatics at the Faculty of 
Food Technology and Biotechnology, University of Zagreb, 
Croatia and is accessible at our published web address [47]. 
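
The final step of the annotation described above, linking each predicted protein to a KEGG orthologue from HMM search results, might be sketched as follows. This is a minimal illustration and not the MEGGASENSE implementation: it assumes the search results are available in the tabular format written by the HMMER3 `hmmsearch --tblout` option, and the E-value cut-off of 1e-5, the best-bit-score rule, and the protein and KO names in the sample data are all assumptions made for the example.

```python
# Hypothetical excerpt in the layout of an `hmmsearch --tblout` file:
# target (protein), accession, query (HMM profile), accession, E-value, score, ...
SAMPLE_TBLOUT = """\
# target name  accession  query name  accession  E-value  score  bias  ...
protein_A  -  K00001  -  1.0e-50  170.2  0.1  2.0e-50  169.0  0.1  1.0  1  0  0  1  1  1  1  -
protein_A  -  K00002  -  1.0e-10   40.0  0.0  3.0e-10   38.5  0.0  1.0  1  0  0  1  1  1  1  -
protein_B  -  K00002  -  3.0e-30  105.5  0.2  4.0e-30  105.0  0.2  1.0  1  0  0  1  1  1  1  -
protein_C  -  K00003  -  0.5        8.1  0.0  0.7        7.6  0.0  1.0  1  0  0  1  1  0  0  -
"""


def assign_best_ko(tblout_text: str, max_evalue: float = 1e-5) -> dict:
    """Map each protein to the KO profile with the highest bit score.

    Hits with a full-sequence E-value above `max_evalue` are ignored
    (the threshold is an assumption, not a value stated in the paper).
    """
    best = {}  # protein -> (bit score, KO identifier)
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        protein, ko = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue > max_evalue:
            continue
        if protein not in best or score > best[protein][0]:
            best[protein] = (score, ko)
    return {protein: ko for protein, (_, ko) in best.items()}


annotation = assign_best_ko(SAMPLE_TBLOUT)
print(annotation)  # {'protein_A': 'K00001', 'protein_B': 'K00002'}
```

Keeping only the single best-scoring profile per protein is the simplest possible rule; a production pipeline would probably also consider per-domain scores and profile-specific thresholds.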
Additional file 1: Table S16b. Predicted (UniProt) homologues of
animal toxins encoded in the genome of A. digitifera.